(57) Abstract

A novel gene and the protein encoded therein, i.e., dysferlin, are disclosed. This gene and its expression products are associated with
muscular dystrophy, e.g., Miyoshi myopathy and limb girdle muscular dystrophy 2B.

DYSFERLIN, A GENE MUTATED IN DISTAL MYOPATHY
AND LIMB GIRDLE MUSCULAR DYSTROPHY

RELATED APPLICATION INFORMATION

This application claims priority from provisional 
application serial no. 60/097,927, filed August 25, 1998. 

Statement as to Federally Sponsored Research 
The work described herein was supported in part by 
NIH grants 5P01AG12992, 5R01NS34913A, and 5P01NS31248.
The Federal Government therefore may have certain rights 
in the invention. 

Background of the Invention 
The invention relates to genes involved in the
onset of muscular dystrophy.

Muscular dystrophies constitute a heterogeneous
group of disorders. Most are characterized by weakness
and atrophy of the proximal muscles, although in rare
myopathies such as "Miyoshi myopathy" symptoms may first
arise in distal muscles. Of the various hereditary types
of muscular dystrophy, several are caused by mutations or
deletions in genes encoding individual components of the
dystrophin-associated protein (DAP) complex. It is this
DAP complex that links the cytoskeletal protein
dystrophin to the extracellular matrix protein,
laminin-2.

Muscular dystrophies may be classified according
to the gene mutations that are associated with specific
clinical syndromes. For example, mutations in the gene
encoding the cytoskeletal protein dystrophin result in
either Duchenne's Muscular Dystrophy or Becker's Muscular
Dystrophy, whereas mutations in the gene encoding the
extracellular matrix protein merosin produce Congenital
Muscular Dystrophy. Muscular dystrophies with an
autosomal recessive mode of inheritance include "Miyoshi
myopathy" and the several limb-girdle muscular
dystrophies (LGMD2). Of the limb-girdle muscular
dystrophies, the deficiencies resulting in LGMD2C, D, E,
and F result from mutations in genes encoding the
membrane-associated sarcoglycan components of the DAP
complex.

Summary of the Invention

A novel protein, designated dysferlin, is
identified and characterized. The dysferlin gene is
normally expressed in skeletal muscle cells and is
selectively mutated in several families with the
hereditary muscular dystrophies, e.g., Miyoshi myopathy
(MM) and limb girdle muscular dystrophy-2B (LGMD2B).
These characteristics of dysferlin render it a candidate
disease gene for both MM and LGMD2B. An additional novel
protein, brain-specific dysferlin, has also been
identified. Defects in brain-specific dysferlin may
predispose to selected disorders of the central nervous
system. Moreover, the expression of brain-specific
dysferlin may be important as a marker for normal neural
development (e.g., in vivo or in neural cells in
culture). Manipulation of levels of expression of
brain-specific dysferlin, and of the type of expressed
brain-specific dysferlin, is of use for analyzing the
function of brain-specific dysferlin and related
dysferlin-associated molecules.

The invention features an isolated DNA which
includes a nucleotide sequence hybridizing under
stringent hybridization conditions to a strand of SEQ ID
NO:3 or SEQ ID NO:117.

The invention also features an isolated DNA
including a nucleotide sequence selected from SEQ ID
NOs:4-12.

Also within the invention is an isolated DNA
comprising a nucleotide sequence selected from the group
consisting of SEQ ID NOs:22-30.

Also within the invention is a single stranded
oligonucleotide of 14-50 nucleotides in length having a
nucleotide sequence identical to a portion of a strand of
SEQ ID NO:3.

Also within the invention is a pair of PCR primers
consisting of:

(a) a first single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides of the sense
strand of SEQ ID NO:117; and

(b) a second single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides of the
antisense strand of SEQ ID NO:117, wherein the sequence
of at least one of the oligonucleotides is identical to a
portion of a strand of SEQ ID NO:3, and the first
oligonucleotide is not complementary to the second
oligonucleotide.
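
By way of illustration only, the structural constraints recited for such a primer pair can be checked computationally. The following Python sketch assumes that "not complementary" means neither oligonucleotide contains the reverse complement of the other; that reading, the function names, and the template used in any test are assumptions made for the example and are not taken from the sequence listing.

```python
# Illustrative check of the primer-pair constraints described above.
# All names here are hypothetical; no sequence from the listing is used.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_valid_primer_pair(sense_template: str, first: str, second: str,
                         min_len: int = 14, max_len: int = 50) -> bool:
    """Check length, strand of origin, and non-complementarity of a pair."""
    sense = sense_template.upper()
    antisense = reverse_complement(sense)
    first, second = first.upper(), second.upper()

    # Each oligonucleotide must be 14-50 nucleotides long.
    if not (min_len <= len(first) <= max_len):
        return False
    if not (min_len <= len(second) <= max_len):
        return False

    # The first must be contiguous in the sense strand, the second in the
    # antisense strand.
    if first not in sense or second not in antisense:
        return False

    # Assumed reading of "not complementary": neither oligonucleotide
    # contains the reverse complement of the other.
    if reverse_complement(first) in second:
        return False
    if reverse_complement(second) in first:
        return False
    return True
```

A pair taken from opposite ends of a template passes this check, whereas a primer paired with its own reverse complement is rejected.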

Also within the invention is a pair of single
stranded oligonucleotides selected from SEQ ID NOs
130-231, SEQ ID NO:110, and SEQ ID NO:112.

Also within the invention is an isolated DNA
including a nucleotide sequence that encodes a protein
that shares at least 70% sequence identity with SEQ ID
NO:2, or a complement of the nucleotide sequence.

Also within the invention is an isolated DNA
including a nucleotide sequence which hybridizes under
stringent hybridization conditions to a strand of a
nucleic acid, the nucleic acid having a sequence selected
from SEQ ID NOs:31-79 and 90-101.

Also within the invention is a single stranded
oligonucleotide of 14-50 nucleotides in length having a
nucleotide sequence which is identical to a portion of a
strand of a nucleic acid selected from SEQ ID NOs:31-79
and 90-100.

Also within the invention is a pair of PCR primers
consisting of:

(a) a first single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides of the sense
strand of a nucleic acid selected from SEQ ID NOs:31-85;
and

(b) a second single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides of the
antisense strand of a nucleic acid selected from SEQ ID
NOs:31-85, wherein the sequence of at least one of the
oligonucleotides includes a sequence identical to a
portion of a strand of a nucleic acid selected from SEQ
ID NOs:31-79 and 90-100, and the first oligonucleotide
is not complementary to the second oligonucleotide.

Also within the invention is a pair of single
stranded oligonucleotides selected from SEQ ID NOs
101-116, SEQ ID NOs 184-185, SEQ ID NOs 188-191, SEQ ID
NOs 210-213, and SEQ ID NOs 216-217.

Also within the invention is a substantially pure
protein that has an amino acid sequence sharing at least
70% sequence identity with SEQ ID NO:2.

Also within the invention is a substantially pure
protein the sequence of which includes amino acid
residues 1-500, 501-1000, 1001-1500, or 1501-2080 of SEQ
ID NO:2.

Also within the invention is a substantially pure
protein including the amino acid sequence of SEQ ID
NO:86, SEQ ID NO:87, SEQ ID NO:88, or SEQ ID NO:89.

In another aspect, the invention features a
transgenic non-human mammal having a transgene disrupting
or interfering with the expression of a dysferlin gene,
the transgene being chromosomally integrated into the
germ cells of the animal.

Another embodiment of the invention features a
method of decreasing the symptoms of muscular dystrophy
in a mammal by introducing into a cell of the mammal
(e.g., a muscle cell or a muscle precursor cell) an
isolated DNA which hybridizes under stringent
hybridization conditions to a strand of SEQ ID NO:3.

Another aspect of the invention provides a method
for identifying a patient, a fetus, or a pre-embryo at
risk for having a dysferlin-related disorder by (a)
providing a sample of genomic DNA from the patient,
fetus, or pre-embryo; and (b) determining whether the
sample contains a mutation in a dysferlin gene.

In another aspect, the invention provides a method
for identifying a patient, a fetus, or a pre-embryo at
risk for having a dysferlin-related disorder by (a)
providing a sample including dysferlin mRNA from the
patient, fetus, or pre-embryo; and (b) determining
whether the dysferlin mRNA contains a mutation.

Methods of identifying mutations in a dysferlin
sequence are useful for predicting (e.g., predicting
whether an individual is at risk for developing a
dysferlin-related disorder) or diagnosing disorders
associated with dysferlin, e.g., MM and LGMD2B. Such
methods can also be used to determine if an individual,
fetus, or a pre-embryo is a carrier of a dysferlin
mutation, for example in screening procedures. Methods
which distinguish between different dysferlin alleles
(e.g., a mutant dysferlin allele and a normal dysferlin
allele) can be used to determine carrier status.

The invention also features an isolated nucleic
acid comprising a nucleotide sequence which hybridizes
under stringent hybridization conditions to nucleotides
3284-3720 of SEQ ID NO:232, or the complement of the
nucleotide sequence. An isolated nucleic acid including
a nucleotide sequence identical to the sequence of
nucleotides 3284-3720 of SEQ ID NO:232, or a complement
of the nucleotide sequence, is also a feature of the
invention. The isolated nucleic acid can include the
entire sequence of SEQ ID NO:232 or the complement of SEQ
ID NO:232.

Another aspect of the invention features an
isolated polypeptide that includes: a) at least 15
contiguous amino acids of the polypeptide comprising
amino acids 1-24 of SEQ ID NO:233, b) a naturally
occurring allelic variant of a polypeptide comprising
amino acids 1-24 of SEQ ID NO:233, or c) an amino
acid sequence which is encoded by a nucleic acid molecule
which hybridizes under stringent conditions to
nucleotides 3284-3720 of SEQ ID NO:232. The polypeptide
of this aspect can include the entire sequence of SEQ ID
NO:233.

Also included in the invention is a vector
comprising the nucleic acid of claim 44 and a cell that
contains the vector. Another aspect of the invention
features a method of making a polypeptide by culturing
the cell which contains the vector.

The invention also features an antibody which
specifically binds to a polypeptide such as those
described above. The antibody can bind to a polypeptide
selected from amino acids 253-403 of SEQ ID NO:233, amino
acids 624-865 of SEQ ID NO:233, and amino acids 1664-1786
of SEQ ID NO:233. Antibodies of the invention can be
monoclonal or polyclonal antibodies.

An "isolated DNA" is DNA which has a naturally
occurring sequence corresponding to part or all of a
given gene but is free of the two genes that normally
flank the given gene in the genome of the organism in
which the given gene naturally occurs. The term
therefore includes a recombinant DNA incorporated into a
vector, into an autonomously replicating plasmid or
virus, or into the genomic DNA of a prokaryote or
eukaryote. It also includes a separate molecule such as
a cDNA, a genomic fragment, a fragment produced by
polymerase chain reaction (PCR), or a restriction
fragment, as well as a recombinant nucleotide sequence
that is part of a hybrid gene, i.e., a gene encoding a
fusion protein. The term excludes intact chromosomes and
large genomic segments containing multiple genes
contained in vectors or constructs such as cosmids, yeast
artificial chromosomes (YACs), and P1-derived artificial
chromosome (PAC) contigs.

A "noncoding sequence" is a sequence which
corresponds to part or all of an intron of a gene, or to
a sequence which is 5' or 3' to a coding sequence and so
is not normally translated.

An expression control sequence is "operably
linked" to a coding sequence when it is within the same
nucleic acid and can control expression of the coding
sequence.

A "protein" or "polypeptide" is any chain of amino
acids linked by peptide bonds, regardless of length or
post-translational modification, e.g., glycosylation or
phosphorylation.

As used herein, the term "percent sequence
identity" means the percentage of identical subunits at
corresponding positions in two sequences when the two
sequences are aligned to maximize subunit matching, i.e.,
taking into account gaps and insertions. For purposes of
the present invention, percent sequence identity between
two polypeptides is to be determined using the Gap
program and the default parameters as specified therein.
The Gap program is part of the Sequence Analysis Software
Package of the Genetics Computer Group, University of
Wisconsin Biotechnology Center, 1710 University Avenue,
Madison, WI 53705.
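
As a non-limiting illustration of the definition above, percent sequence identity can be computed from two sequences that have already been aligned. The Python sketch below does not reimplement the Gap program; it assumes the alignment is supplied with "-" marking gaps, and it assumes that columns containing a gap are left out of the denominator, a convention chosen for the example only.

```python
# Illustrative percent-identity calculation on a pre-computed alignment.
# The gap handling is an assumption of this sketch, not the Gap program's
# documented behavior.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns in which either sequence has a gap
        compared += 1
        if x == y:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def shares_at_least(aligned_a: str, aligned_b: str,
                    threshold: float = 70.0) -> bool:
    """True when the aligned pair meets the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The default threshold of 70 mirrors the "at least 70% sequence identity" language used above; any other value can be passed in.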

The algorithm of Myers and Miller, CABIOS (1989)
can also be used to determine whether two sequences are
similar or identical. Such an algorithm is incorporated
into the ALIGN program (version 2.0) which is part of the
GCG sequence alignment software package. When utilizing
the ALIGN program for comparing amino acid sequences, a
PAM120 weight residue table, a gap length penalty of 12,
and a gap penalty of 4 can be used.

As used herein, the term "stringent hybridization
conditions" means the following DNA hybridization and
wash conditions: hybridization at 60°C in the presence
of 6 x SSC, 0.5% SDS, 5 x Denhardt's Reagent, and 100
µg/ml denatured salmon sperm DNA; followed by a first
wash at room temperature for 20 minutes in 0.5 x SSC and
0.1% SDS and a second wash at 55°C for 30 minutes in 0.2
x SSC and 0.1% SDS.

A "substantially pure protein" is a protein
separated from components that naturally accompany it.
The protein is considered to be substantially pure when
it is at least 60%, by dry weight, free from the proteins
and other naturally-occurring organic molecules with
which it is naturally associated. Preferably, the purity
of the preparation is at least 75%, more preferably at
least 90%, and most preferably at least 99%, by weight.
A substantially pure dysferlin protein can be obtained,
for example, by extraction from a natural source, by
expression of a recombinant nucleic acid encoding a
dysferlin polypeptide, or by chemical synthesis. Purity
can be measured by any appropriate method, e.g., column
chromatography, polyacrylamide gel electrophoresis, or
HPLC analysis. A chemically synthesized protein or a
recombinant protein produced in a cell type other than
the cell type in which it naturally occurs is, by
definition, substantially free from components that
naturally accompany it. Accordingly, substantially pure
proteins include those having sequences derived from
eukaryotic organisms but which have been recombinantly
produced in E. coli or other prokaryotes.

An antibody that "specifically binds" to an
antigen is an antibody that recognizes and binds to the
antigen, e.g., a dysferlin polypeptide, but which does
not substantially recognize and bind to other molecules
in a sample (e.g., a biological sample) which naturally
includes the antigen, e.g., a dysferlin polypeptide. An
antibody that "specifically binds" to dysferlin is
sufficient to detect a dysferlin polypeptide in a
biological sample using one or more standard
immunological techniques (for example, Western blotting
or immunoprecipitation).

A "transgene" is any piece of DNA, other than an
intact chromosome, which is inserted by artifice into a
cell, and becomes part of the genome of the organism
which develops from that cell. Such a transgene may
include a gene which is partly or entirely heterologous
(i.e., foreign) to the host organism, or may represent a
gene homologous to an endogenous gene of the organism.

Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art
to which this invention belongs. Methods and materials
similar or equivalent to those described herein can be
used in the practice or testing of the present invention.
The present materials, methods, and examples are
illustrative only and not intended to be limiting. All
publications, patent applications, patents, and other
references mentioned herein are incorporated by reference
in their entirety. In case of conflict, the present
specification, including definitions, will control. All
the sequences disclosed in the sequence listing are meant
to be double-stranded except the sequences of
oligonucleotides.

Other features and advantages of the invention
will be apparent from the following detailed description,
and from the claims.

Brief Description of the Drawings
Fig. 1A is a physical map of the MM locus. Arrows
indicate the five new polymorphic markers and filled,
vertical rectangular boxes indicate the previously known
polymorphic markers. The five ESTs that are expressed in
skeletal muscle are highlighted in bold. Detailed
information on the minimal tiling path of the PAC contig
spanning the MM/LGMD2B region is provided in Liu et al.,
1998, Genomics 49:23-29. The minimal candidate MM region
is designated by the solid bracket (top) and compared to
the previous candidate region (dashed bracket). TGFA and
ADD2 are transforming growth factor alpha and β-adducin
2.

Fig. 1B is a representation of the dysferlin cDNA
clones. The probes used in the three successive screens
are shown in bold (130347, cDNA10, A27-F2R2). The two
most 5' cDNA clones are also shown (B22, B33). The 6.9
kb cDNA for dysferlin (SEQ ID NO:1) is illustrated at the
bottom with start and stop codons as shown.

Fig. 1C is a representation of the predicted
dysferlin protein. The locations of four C2 domains
(SEQ ID NOs:86-89) are indicated by stippled boxes,
while the putative transmembrane region is hatched.
Vertical lines above the cDNA denote the positions of the
mutations in Table 2; the associated labels indicate the
phenotypes (MM - Miyoshi myopathy; LGMD - limb girdle
muscular dystrophy; DMAT - distal myopathy with anterior
tibial onset).

Fig. 2 is the sequence of the predicted 2,080
amino acids of dysferlin (SEQ ID NO:2). The predicted
membrane spanning residues are in bold at the carboxy
terminus (residues 2047-2063). Partial C2 domains are
underlined. Bold, underlined sequences are putative
nuclear targeting residues. Possible membrane retention
sequences are enclosed within a box.

Fig. 3 is a comparison of the Kyte-Doolittle
hydrophobicity plots of the dysferlin protein and fer-1.
On the Y-axis, increasing positivity corresponds to
increasing hydrophobicity. Both proteins have a single,
highly hydrophobic stretch at the carboxy terminal end
(arrow). Both share regions of relative hydrophilicity
approximately at residue 1,000 (arrowhead).
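
For illustration, a hydrophobicity profile of the kind plotted in Fig. 3 can be approximated with a sliding-window average of Kyte-Doolittle hydropathy values. The Python sketch below is not the procedure used to prepare the figure; the window size of 19 residues and the function names are assumptions made for the example.

```python
# Illustrative sliding-window hydropathy profile using the Kyte-Doolittle
# scale. Window size and function names are assumptions of this sketch.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Mean hydropathy of every full-length window along the sequence."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(protein: str, window: int = 19) -> int:
    """Zero-based start index of the window with the highest mean score."""
    profile = hydropathy_profile(protein, window)
    return max(range(len(profile)), key=profile.__getitem__)
```

More positive values indicate greater hydrophobicity, matching the Y-axis convention stated for Fig. 3; a carboxy-terminal membrane-spanning stretch would appear as a peak near the end of the profile.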

Fig. 4 is an SSCP analysis of a representative
pedigree with dysferlin mutations. Each member of the
pedigree is illustrated above the corresponding SSCP
analysis. For each affected individual (solid symbols)
shifts are evident in alleles 1 and 2, corresponding
respectively to exons 36 and 54. As indicated, the
allele 1 and 2 variants are transmitted respectively from
the mother and the father. The two affected daughters in
this pedigree have the limb girdle muscular dystrophy
(LGMD) phenotype while their affected brother has a
pattern of weakness suggestive of Miyoshi myopathy (MM).

Fig. 5 is a representation of the genomic
structure of dysferlin. The 55 exons of the dysferlin
gene and their corresponding SEQ ID NOs are indicated
below the 6911 bp cDNA (solid line). The cDNA sequences
corresponding to SEQ ID NO:1 and SEQ ID NO:3 are shown
relative to the 6911 bp cDNA.

Figs. 6A-B are the cDNA sequence of brain-specific
dysferlin (SEQ ID NO:232) and the predicted amino acid
sequence (in single-letter code) of brain-specific
dysferlin (SEQ ID NO:233).

Detailed Description
The Miyoshi myopathy (MM) locus maps to human
chromosome 2p12-14 between the genetic markers D2S292 and
D2S286 (Bejaoui et al., 1995, Neurology 45:768-72).
Further refined genetic mapping in MM families placed the
MM locus between markers GGAA-P7430 and D2S2109 (Bejaoui
et al., 1998, Neurogenetics 1:189-96). Independent
investigation has localized the limb-girdle muscular
dystrophy (LGMD-2B) to the same genetic interval (Bashir
et al., 1994, Hum. Molec. Genetics 3:455-57; Bashir et
al., 1996, Genomics 33:46-52; Passos-Bueno et al., 1995,
Genomics 27:192-95). Furthermore, two large, inbred
kindreds have been described whose members include both
MM and LGMD2B patients (Weiler et al., 1996, Am. J. Hum.
Genet. 59:872-78; Illarioshkin et al., 1997, Genomics
42:345-48). In these familial studies, the disease
gene(s) for both MM and LGMD2B mapped to essentially the
same genetic interval. Moreover, in both pedigrees,
individuals with MM or LGMD2B phenotypes share the same
haplotypes. This raises the intriguing possibility that
the two diseases may arise from the same gene defect and
that a particular disease phenotype is the result of
modification by additional factors.

A 3-Mb PAC contig spanning the entire MM/LGMD2B
candidate region was recently constructed to facilitate
the cloning of the MM/LGMD2B gene(s) (Liu et al., 1998,
Genomics 49:23-29). This high resolution PAC contig
resolved the discrepancies of the order of markers in
previous studies (Bejaoui et al., 1998, Neurogenetics
1:189-96; Bashir et al., 1996, Genomics 33:46-52; Hudson
et al., 1995, Science 270:1945-54). The physical size of
the PAC contig also indicated that the previous minimal
size estimation based on YAC mapping data was a
significant underestimate.

WO 00/11157
PCT/US99/19395

Identification of Repeat Sequences and Repeat Typing

The PAC contig spanning the MM/LGMD2B region (Liu
et al., 1998, Genomics 49:23-29) was used as a source for
the isolation of new informative markers to narrow the
genetic interval of the disease gene(s). DNA from the
PAC clones spanning the MM/LGMD2B region was spotted onto
Hybond N+™ membrane filters (Amersham, Arlington Heights,
IL). The filters were hybridized independently with the
following γ-32P (Du Pont, Wilmington, DE) labeled repeat
sequences: (1) (CA)15; (2) pool of (ATT)10, (GATA)8 and
(GGAA)8; (3) pool of (GAAT)8, (GGAT)8 and (GTAT)8; and (4)
pool of (AAG)10 and (ATC)10. Hybridization and washing of
the filters were carried out at 55°C following standard
protocols (Sambrook et al., 1989, Molecular Cloning: A
Laboratory Manual (2nd Edition), Cold Spring Harbor
Press, N.Y.).
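
As an in silico counterpart to the filter hybridization described above, repeat probes of the form (unit)n can be written out and a known sequence scanned for tandem runs of the same unit. The Python sketch below is illustrative only: the probe pools are transcribed from the text, whereas the function names and the minimum copy number are assumptions, and only the strand supplied is searched.

```python
# Illustrative construction of repeat probes and a scan for tandem repeats.
# Function names and thresholds are hypothetical.
import re

# Probe pools as (repeat unit, copy number), transcribed from the text.
PROBE_POOLS = {
    1: [("CA", 15)],
    2: [("ATT", 10), ("GATA", 8), ("GGAA", 8)],
    3: [("GAAT", 8), ("GGAT", 8), ("GTAT", 8)],
    4: [("AAG", 10), ("ATC", 10)],
}


def make_repeat_probe(unit: str, copies: int) -> str:
    """Write out an oligonucleotide such as (CA)15 as a plain sequence."""
    return unit.upper() * copies


def find_tandem_repeats(sequence: str, unit: str, min_copies: int = 5) -> list:
    """Return (start, end, copies) for each run of at least min_copies units.

    Coordinates are zero-based and end-exclusive. Only the given strand is
    searched; the complementary motif would need a second call.
    """
    unit = unit.upper()
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), match.end(), copies))
    return hits
```

A subclone insert sequence could be passed through `find_tandem_repeats` once per repeat unit in `PROBE_POOLS` to locate candidate microsatellites before primers are designed to flank them.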

Miniprep DNAs of PAC clones containing repeat
sequences were digested with restriction enzymes HindIII
and PstI and ligated into pBluescript II (KS+) vector
(Stratagene, La Jolla, CA) which had been digested with
the same enzymes. Filters of the PAC subclones were
hybridized to the γ-32P labeled repeats that detected the
respective PACs. For clones with an insert size greater
than 1 kb the repeat sequences of which could not be
identified by a single round of sequencing, the inserts
were further subcloned by digestion with HaeIII and
ligation in EcoRV-digested pZero-2.1 vector (Invitrogen,
Inc., Carlsbad, CA). Miniprep DNAs of the positive
subclones were subjected to manual dideoxy sequencing
with Sequenase™ enzyme (US Biochemicals, Inc., Cleveland,
OH). Primer pairs for amplifying the repeat sequences
were selected using the computer program Oligo (Version
4.0, National Biosciences, Inc., Plymouth, MN). Primer
sequences are shown in Table 1.






Table 1. Newly identified polymorphic repeat markers in the
MM/LGMD2B region and the primer pairs used to amplify them
(table contents not legible in the source).




Identification of Repeat Markers and Haplotype Analysis

After hybridization with labeled repeat oligos,
17 different groups of overlapping PACs were identified
that contained repeat sequences. Some groups contained
previously identified repeat markers. For example, five
groups of PACs were positively identified by a pool of
repeat probes including (ATT)10, (GATA)8, and (GGAA)8. Of
these, three groups contained known markers GGAA-P7430
(GGAA repeat), D2S1394 (GATA repeat) and D2S1398 (GGAA
repeat) (Hudson et al., 1992, Nature 13:622-29; Gastier
et al., 1995, Hum. Molecular Genetics 4:1829-36). No
attempt was made to isolate new repeat markers from these
PACs and they were not further analyzed. Similarly,
seven groups of PACs that contained known CA repeat
markers were excluded. Seven groups of PACs that
contained unidentified repeats were retained for further
analysis. For each group, the PAC containing the
smallest insert was selected for subcloning. Subclones
were re-screened and positive clones were sequenced to
identify repeats. In total, seven new repeat sequences
were identified within the MM/LGMD2B PAC contig. Of
these, five are polymorphic within the population that
was tested. The information for these five markers is
summarized in Table 1. Based on the PAC contig
constructed previously across the MM candidate locus (Liu
et al., 1998, Genomics 49:23-29), the five new markers
and ten previously published polymorphic markers were
placed in an unambiguous order (Fig. 1).

These markers were analyzed in a large,
consanguineous MM family (Bejaoui et al., 1995, Neurology
45:768-72; Bejaoui et al., 1998, Neurogenetics
1:189-96). Because MM is a recessive condition, the locus can
be defined by identifying regions of the genome that show
homozygosity in affected individuals. Conversely,
because of the high penetrance of this adult-onset
condition, unaffected adult individuals are not expected
to be homozygous by descent across the region. Analysis
of haplotype homozygosity in this pedigree indicates that
the disease gene lies between markers D2S2111 and
PAC3-H52. Based on the PAC mapping data, the physical
distance for this interval is approximately 2.0 Mb. No
recombination events were detected between four
informative markers (markers cy172-H32 to PAC16-H41) and
the disease locus in family MM-21 (Fig. 1A).
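
The homozygosity reasoning above can be sketched in code. The following Python example is illustrative only and is not the analysis performed on family MM-21: it assumes markers are supplied in physical order, that each affected individual's genotype is a pair of alleles per marker, and that "homozygous by descent" can be approximated by all affected individuals being homozygous for the same allele. Marker names and alleles used with it are hypothetical.

```python
# Illustrative search for the longest run of ordered markers at which all
# affected individuals are homozygous for the same allele. Input format
# and names are assumptions of this sketch.

def shared_homozygous_interval(markers: list, affected: list) -> list:
    """Return the longest contiguous run of markers meeting the criterion.

    markers  -- marker names in physical order along the chromosome
    affected -- one dict per affected individual: marker -> (allele, allele)
    """
    best_start, best_len = 0, 0
    run_start = None

    for i, marker in enumerate(markers):
        genotypes = [person[marker] for person in affected]
        all_homozygous = all(a == b for a, b in genotypes)
        one_shared_allele = len({a for a, _ in genotypes}) == 1

        if all_homozygous and one_shared_allele:
            if run_start is None:
                run_start = i
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None  # heterozygosity or allele mismatch ends the run

    return markers[best_start:best_start + best_len]
```

The markers immediately flanking the returned run, where homozygosity is lost in at least one affected individual, would bound the candidate interval in the manner described for D2S2111 and PAC3-H52.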

Identification of Five Muscle-Expressed ESTs

Twenty-two ESTs and two genes (transforming growth
factor alpha [TGFα] and beta-adducin [ADD2]) were
previously mapped to the MM/LGMD2B PAC contig (Fig. 1A)
(Liu et al., 1998, Genomics 49:23-29). Two µl
(approximately 0.1 ng/µl) of Marathon-ready™ skeletal
muscle cDNA (Clontech, Palo Alto, CA) were used as
template in a 10 µl PCR reaction for analysis of muscle
expression of ESTs. The PCR conditions were the same as
for the PCR typing of repeat markers. PCR analysis of
skeletal muscle cDNA indicated that five of these ESTs
(A006G04, stSG1553R, WI-14958, TIGR-A004Z44 and WI-14051)
map within the minimal genetic interval of MM and are
expressed in skeletal muscle.

Probes were selected corresponding to each of
these five ESTs for Northern blot analysis. cDNA clones
(130347, 48106, 172575, 184080, and 510138) corresponding
to the five ESTs that are expressed in muscle
(respectively TIGR-A004Z44, WI-14051, WI-14958, stSG1553R
and A006G04) were selected from the UniGene database
(http://www.ncbi.nlm.nih.gov/UniGene/) and obtained from
Genome Systems, Inc. (St. Louis, MO). The cDNA probes
were first used to screen the MM/LGMD2B PAC filters to
confirm that they mapped to the expected position in the
MM/LGMD2B contig.

A Northern blot (Clontech) of multiple human
tissues was sequentially hybridized to the five cDNA
probes and a control β-actin cDNA at 65°C following
standard hybridization and washing protocols (Sambrook et
al., supra). Between hybridizations, probes were removed
by boiling the blot at 95-100°C for 4-10 min with 0.5%
SDS. The blot was then re-exposed for 24 h to confirm
the absence of previous hybridization signals before
proceeding with the next round of hybridization.

The tissue distribution, intensity of the signals
and size of transcripts detected by the five cDNA probes
varied. Probes corresponding to ESTs stSG1553R,
TIGR-A004Z44 and WI-14958 detected strong signals in skeletal
muscle. In addition, the cDNA corresponding to
TIGR-A004Z44 detected a 3.6-3.8 kb brain-specific transcript
instead of the 8.5 kb message that was present in other
tissues. It is likely that these five ESTs correspond to
different genes since the corresponding cDNA probes used
for Northern analysis derive from the 3' end of messages,
map to different positions in the MM/LGMD2B contig (Fig.
1A), and differ in their expression patterns.

Current database analysis suggests that three of
these ESTs (stSG1553R, WI-14958 and WI-14051) do not
match any known proteins (Schuler et al., 1996, Science
274:540-46). A006G04 has weak homology with a protein
sequence of unknown function that derives from C.
elegans. TIGR-A004Z44 has homology only to subdomains
present within protein kinase C. Because the five genes
corresponding to the ESTs are expressed in skeletal
muscle and map within the minimal genetic interval of the
MM/LGMD2B gene(s), they are candidate MM/LGMD2B gene(s).

Cloning of Dysferlin cDNA

EST TIGR-A004Z44 gave a particularly strong skeletal muscle
signal on the Northern blot. Moreover, it is bracketed by
genetic markers that show no recombination with the disease
phenotype in family MM-21 (Fig. 1). The corresponding
transcript was therefore cloned and analyzed as a candidate MM
gene. From the Unigene database, a cDNA IMAGE clone (130347,
979 bp) was identified that contained the 483 bp EST
TIGR-A004Z44.

Approximately 1 x 10^6 recombinant clones of a λgt11 human
skeletal muscle cDNA library (Clontech) were plated and
screened following standard techniques (Sambrook et al.,
supra). The initial library screening was performed using the
insert released from the clone 130347 that contains EST
TIGR-A004Z44, corresponding to the 3' end of the gene. Positive
phages were plaque purified and phage DNA was isolated
according to standard procedures (Sambrook et al., supra). The
inserts of the positive clones were released by EcoRI digestion
of phage DNA and subsequently subcloned into the EcoRI site of
pBluescript II (KS+) vector (Stratagene).

Fifty cDNA clones were identified when a human skeletal muscle
cDNA library was screened with the 130347 cDNA. Clone cDNA10,
with the largest insert (~6.5 kb) (Fig. 1B), was digested
independently with BamHI and PstI and further subcloned into
pBluescript vector. Miniprep DNA of cDNA clones and subclones
of cDNA10 was prepared using the Qiagen plasmid Miniprep kit
(Valencia, CA). Sequencing was carried out from both ends of
each clone using the SequiTherm EXCEL™ long-read DNA sequencing
kit (Epicenter, Madison, WI), fluorescent-labeled M13 forward
and reverse primers, and a LI-COR sequencer (Lincoln, NE).
Assembly of cDNA contigs and sequence analysis were performed
using Sequencher software (Gene Codes Corporation, Inc., Ann
Arbor, MI).
Two additional screens, first with the insert of cDNA10 and
then a 683 bp PCR product (A27-F2R2) amplified from the 5' end
of the cDNA contig, identified 87 additional cDNA clones.
Clones B22 and B33 extended the 5' end by 94 and 20 bp,
respectively. The compiled sequence allowed for the generation
of a sequence of 6.9 kb (SEQ ID NO:1) (with 10-fold average
coverage).

Although the 5' end of the gene has not been further extended
to the 8.5 kb predicted by Northern analysis, an open reading
frame (ORF) of 6,243 bp has been identified within this 6.9 kb
sequence. This ORF is preceded by an in-frame stop codon and
begins with the sequence cgcaagcATGCTG (SEQ ID NO:118); five of
the first seven bp are consistent with the Kozak consensus
sequence for a start codon (Kozak, 1989, Nucl. Acids Res.
15:8125-33; Kozak, 1989, J. Cell. Biol. 108:229-41). An
alternate start codon, in the same frame, +75 bp downstream
(GAGACGATGGGG; SEQ ID NO:119), appears less likely as a start
site. Thus, the entire coding region of this candidate gene is
believed to have been identified, as represented by the 6.9 kb
sequence contig.
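
The "five of the first seven bp" comparison can be reproduced
with a short sketch. It assumes that the seven positions
immediately upstream of the ATG are compared against the
commonly cited Kozak consensus cgccRcc (R = a or g); the text
does not spell out which consensus string was used, and the
helper name kozak_matches is hypothetical.

```python
# Assumption: consensus for positions -7..-1 is "cgccRcc" (R = purine).
# This is the commonly cited Kozak context, not a string taken from the text.
KOZAK_UPSTREAM = "cgccRcc"


def kozak_matches(upstream: str, consensus: str = KOZAK_UPSTREAM) -> int:
    """Count positions where `upstream` agrees with `consensus`."""
    if len(upstream) != len(consensus):
        raise ValueError("upstream and consensus must have equal length")
    matches = 0
    for base, expected in zip(upstream.lower(), consensus):
        if expected == "R":
            matches += base in "ag"   # purine position
        else:
            matches += base == expected
    return matches


site = "cgcaagcATGCTG"            # SEQ ID NO:118, as given in the text
start = site.index("ATG")         # offset of the start codon
upstream = site[:start]           # the seven bases preceding the ATG
print(kozak_matches(upstream), "of", len(upstream), "positions match")
```

Under this assumed consensus the count agrees with the figure
stated above; a different consensus string would of course give
a different count.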

Isolation of the Brain-Specific Dysferlin Isoform

Identification of the brain-specific isoform of dysferlin

A brain-specific isoform of dysferlin was identified using
Northern blot analysis of poly(A+) RNA derived from multiple
human adult tissues probed with radiolabeled full-length
dysferlin cDNA subclones. A prominent 7.2 kb transcript was
detected on Northern blots in skeletal muscle, heart, placenta,
lung, and kidney, while a distinct but equally prominent
3.6-3.8 kb transcript was identified exclusively in the brain.
Using long exposures, a faint 7.2 kb mRNA was also detected in
the brain. This finding suggested that the shorter brain
isoform was likely to be a tissue-specific splice variant of
the dysferlin gene. To test this hypothesis, a human brain cDNA
library (Stratagene) was screened for the dysferlin brain
isoform.

Cloning of the brain-specific dysferlin isoform
To identify probes that hybridize to the brain-specific
dysferlin sequence and so could be used for library screening,
fragments of the full-length dysferlin cDNA clone (derived from
a skeletal muscle cDNA library) were generated using
restriction enzymes. The fragments were about 1 kb in length
and were analyzed by hybridization to a Northern blot that
included brain RNA. Sequences suitable for library screening
were those that hybridized to the 3.6-3.8 kb brain-specific
transcript. A region of the 3' end of the dysferlin cDNA
sequence that is approximately 3 kb in length was identified as
hybridizing to brain mRNA. DNA containing sequence from this
region was used as a probe for hybridization screening of a
human brain cDNA library (Stratagene).

The human brain cDNA library was plated out and screened using
standard procedures. Of the approximately 720,000 plaques
screened, 63 primary positive clones were identified. Of these,
20 clones were selected for further analysis involving standard
methods of hybridization, restriction enzyme mapping, and
sequencing. The primary positive clones shared regions of
overlap with each other.

Sequencing of positive clones provided 3,671 nucleotides of the
brain-specific dysferlin sequence (SEQ ID NO:232; Figure 6A-B).
The identified sequence corresponds closely to the size of the
brain-specific dysferlin transcript detected on Northern blots.
With the exception of the 5' region of the sequence, the
brain-specific sequence is identical to about 3.1 kb of the
dysferlin sequence (from nucleotide 3722 to 6904 of the
dysferlin sequence). In the dysferlin gene, position 3722
corresponds to the start of exon 32. This finding is consistent
with the hypothesis that the brain isoform is a splice variant
of the dysferlin gene. At the 5' end of the brain isoform, 489
nucleotides are unique to brain-specific dysferlin. The amino
acid sequence encoded by the brain dysferlin nucleic acid
sequence (SEQ ID NO:233; Figure 6) contains a unique sequence
with an initiation codon within a Kozak consensus sequence. The
nucleic acid sequence unique to brain-specific dysferlin
encodes a novel 24 amino acid sequence.
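
As a rough consistency check on the coordinates quoted above,
the segment lengths can be added up. The sketch below assumes
the stated nucleotide positions are inclusive, which the text
does not confirm; the function name inclusive_length is
hypothetical.

```python
def inclusive_length(first: int, last: int) -> int:
    """Length of a segment whose first and last positions are both counted."""
    return last - first + 1


# Values quoted in the text.
SHARED_FIRST, SHARED_LAST = 3722, 6904   # region shared with dysferlin cDNA
UNIQUE_5PRIME_NT = 489                   # brain-specific 5' sequence
REPORTED_TOTAL_NT = 3671                 # length of SEQ ID NO:232
NOVEL_AMINO_ACIDS = 24                   # novel N-terminal residues

shared_nt = inclusive_length(SHARED_FIRST, SHARED_LAST)
assembled_nt = shared_nt + UNIQUE_5PRIME_NT
novel_coding_nt = NOVEL_AMINO_ACIDS * 3  # nucleotides needed to encode 24 residues

print("shared region:", shared_nt, "nt")
print("shared + unique:", assembled_nt, "nt; reported:", REPORTED_TOTAL_NT, "nt")
print("unique nt that encode the novel residues:", novel_coding_nt)
```

Under the inclusive-coordinate assumption the two segments sum
to within one nucleotide of the reported length, and only a
small part of the 489 unique nucleotides is needed to encode the
24 novel residues.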

Identification of Mutations in Miyoshi Myopathy

Two strategies were used to determine whether this 6.9 kb cDNA
(SEQ ID NO:1) is mutated in MM. First, the genomic organization
of the corresponding gene was determined and the adjoining
intronic sequence at each of the 55 exons which make up the
cDNA was identified. To identify exon-intron boundaries within
the gene, PAC DNA was extracted with the standard Qiagen
Mini-Prep protocol. Direct sequencing was performed with DNA
Sequence System (Promega, Madison, WI) using 32P end-labeled
primers (Benes et al., 1997, Biotechniques 23:98-100).
Exon-intron boundaries were identified as the sites where
genomic and cDNA sequences diverged. Second, in patients for
whom muscle biopsies were available, RT-PCR was also used to
prepare cDNA for the candidate gene from the muscle biopsy
specimen.

Single strand conformational polymorphism analysis (SSCP) was
used to screen each exon in patients from 12 MM families.
Putative mutations identified in this way were confirmed by
direct sequencing from genomic DNA using exon-specific intronic
primers. Approximately 20 ng of total genomic DNA from
immortalized lymphocyte cell lines were used as a template for
PCR amplification analysis of each exon using primers (below)
located in the adjacent introns. SSCP analysis was performed as
previously described (Aoki et al., 1998, Ann. Neurol.
43:645-53). In patients for whom muscle biopsies were
available, mRNA was isolated using RNA-STAT-60™ (Tel-Test,
Friendswood, TX) and first-strand cDNA was synthesized from 1-2
µg total RNA with MMLV reverse transcriptase and random hexamer
primers (Life Technologies, Gaithersburg, MD). Three µl of this
product were used for PCR amplification. Eight sets of primers
were designed for muscle cDNA, and overlapping cDNA fragments
suitable for SSCP analysis were amplified. After initial
denaturation at 94°C for 2 min, amplification was performed
using 30 cycles at 94°C for 30 s, 56°C for 30 s, and 72°C for
60 s. The sequences of polymorphisms detected by SSCP analysis
were determined by the dideoxy termination method using the
Sequenase kit (US Biochemicals). In some instances, the base
pair changes predicted corresponding changes in restriction
enzyme recognition sites. Such alterations in restriction sites
were verified by digesting the relevant PCR products with the
appropriate restriction enzymes.
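
The cycling conditions stated above can be written down as a
small data structure, which makes the total programmed hold time
easy to compute. This is only an illustrative sketch: the class
and function names are hypothetical, and ramp times between
temperatures (which depend on the instrument) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold step of a thermocycler program."""
    temp_c: int
    seconds: int


# Values taken from the text; ramp times are not modeled.
INITIAL_DENATURATION = Step(temp_c=94, seconds=120)
CYCLE = (
    Step(temp_c=94, seconds=30),   # denaturation
    Step(temp_c=56, seconds=30),   # annealing
    Step(temp_c=72, seconds=60),   # extension
)
N_CYCLES = 30


def hold_time_seconds(initial: Step, cycle: tuple, n_cycles: int) -> int:
    """Total programmed hold time, excluding ramping between steps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


total_s = hold_time_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES)
print(f"programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
```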

Primer pairs used for SSCP screening and exon sequencing are as
follows:

(1) exon 3, F3261 5'-tctcttctcctagagggccatag-3' (SEQ ID NO:101)
and R326 5'-ctgttcctccccatcgtctcatgg-3' (SEQ ID NO:102);

(2) exon 20, F3121 5'-gctcctcccgtgaccctctg-3' (SEQ ID NO:103)
and R3121 5'-gggtcccagccaggagcactg-3' (SEQ ID NO:104);

(3) exon 36, F2102 5'-cccctctcaccatctcctgatgtg-3' (SEQ ID
NO:105) and R2111 5'-tggcttcaccttccctctacctcgg-3' (SEQ ID
NO:106);

(4) exon 49, F1081 5'-tcctttggtaggaaatctaggtgg-3' (SEQ ID
NO:107) and R1081 5'-ggaagctggacaggcaagagg-3' (SEQ ID NO:108);

(5) exon 50, F1091 5'-atatactgtgttggaaatcttaatgag-3' (SEQ ID
NO:109) and R1091 5'-gctggcaccacagggaatcgg-3' (SEQ ID NO:110);

(6) exon 51, F1101 5'-ctttgcttccttgcatccttctctg-3' (SEQ ID
NO:111) and R1101 5'-agcccccatgtgcagaatggg-3' (SEQ ID NO:112);

(7) exon 52, F1111 5'-ggcagtgatcgagaaacccgg-3' (SEQ ID NO:113)
and R1111 5'-catgccctccactggggctgg-3' (SEQ ID NO:114);

(8) exon 54, F1141 5'-ggatgcccagttgactccggg-3' (SEQ ID NO:115)
and R1141 5'-ccccaccacagtgtcgtcagg-3' (SEQ ID NO:116);

(9) exon 29, F3031 5'-aagtgccaagcaatgagtgaccgg-3' (SEQ ID
NO:184) and R3021 5'-ctcactcccacccaccacctg-3' (SEQ ID NO:185);

(10) exon 31, F2141 5'-gaatctgccataaccagcttcgtg-3' (SEQ ID
NO:188) and R2141 5'-tatcaccccatagaggcctcgaag-3' (SEQ ID
NO:189);

(11) exon 32, F2981 5'-cagccactcactctggcacctctg-3' (SEQ ID
NO:190) and R2981 5'-agcccacagtctctgactctcctg-3' (SEQ ID
NO:191);

(12) exon 43, F2031 5'-cagccaaaccatatcaacaatg-3' (SEQ ID
NO:210) and R2021 5'-ctggggaggtgagggctctag-3' (SEQ ID NO:211);

(13) exon 44, F2011 5'-gaagtgttttgtctcctcctc-3' (SEQ ID NO:212)
and R2011 5'-gcaggcagccagcccccatc-3' (SEQ ID NO:213);

(14) exon 46, F1041 5'-ctcgtctatgtcttgtgcttgctc-3' (SEQ ID
NO:216) and R1051 5'-caccatggtttggggtcatgtgg-3' (SEQ ID
NO:217).
These primers were used in SSCP screening and exon
sequencing, and identified eighteen different mutations
in fifteen families (Table 2).
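
When working with primer lists such as the one above, it can be
useful to tabulate basic properties of each oligonucleotide. The
sketch below computes length, GC content, and a rough melting
temperature for the exon 20 pair. The melting temperature uses
the Wallace rule (2°C per A/T, 4°C per G/C), which is a coarse
approximation that the source does not mention; the function
names are hypothetical.

```python
def gc_count(primer: str) -> int:
    """Number of G and C bases in a primer sequence."""
    return sum(base in "gc" for base in primer.lower())


def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in °C: 2 * (A + T) + 4 * (G + C).

    Assumption: the Wallace rule is used purely for illustration; it is a
    crude approximation intended for short oligonucleotides.
    """
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


# Exon 20 primer pair, as listed in the text (SEQ ID NO:103 and NO:104).
primers = {
    "F3121": "gctcctcccgtgaccctctg",
    "R3121": "gggtcccagccaggagcactg",
}

for code, seq in primers.items():
    gc_fraction = gc_count(seq) / len(seq)
    print(f"{code}: {len(seq)} nt, GC {gc_fraction:.0%}, "
          f"Wallace Tm ~{wallace_tm(seq)} C")
```

More accurate estimates would use a nearest-neighbor model and
the actual salt and primer concentrations, none of which are
specified here.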
Twelve of the eighteen different mutations are predicted to
block dysferlin expression, either through nonsense or
frameshift changes. Seven of the thirteen samples are
homozygous and thus expected to result in complete loss of
dysferlin function. For each mutated exon in these patients, at
least 50 control DNA samples (100 chromosomes) were screened to
determine the frequencies of the sequence variants. When
possible, the parents and siblings of affected individuals were
also screened to verify that defined mutations were
appropriately co-inherited with the disease in each pedigree
(Fig. 4). In two families (50, 58 in Table 2) heterozygous
mutations were identified in one allele (respectively a
missense mutation and a 2 bp deletion). Mutations in the other
allele of these families, and mutations in three of the
screened MM families, are presumed to have gone undetected
either because the mutant and normal SSCP products are
indistinguishable or because the mutation lies outside of
coding sequence (i.e., in the promoter or a regulatory region
of an intron). The disease-associated mutations did not appear
to arise in the population as common polymorphisms.

More mutations can be identified by using appropriate primer
pairs to amplify an exon and analyze its sequence. The
following primer pairs are useful for exon amplification.

Exon  Code    Primer Sequence

1     F408    5'-gacccacaagcggcgcctcgg-3' (SEQ ID NO:130)
      F4101   5'-gaccccggcgagggtggtcgg-3' (SEQ ID NO:131)
2     F4111   5'-tgtctctccattctcccttttgtg-3' (SEQ ID NO:132)
      R4111   5'-aggacactgctgagaaggcacctc-3' (SEQ ID NO:133)
3     F3262   5'-agtgccctggtggcacgaagg-3' (SEQ ID NO:134)
      R3261   5'-cctacctgcaccttcaagccatgg-3' (SEQ ID NO:135)
4     F3251   5'-cagaagagccagggtgccttagg-3' (SEQ ID NO:136)
      R3251   5'-ccttggaccttaacctggcagagg-3' (SEQ ID NO:137)
5     F3242   5'-cgaggccagcgcaccaacctg-3' (SEQ ID NO:138)
      R3242   5'-actgccggccattcttgctggg-3' (SEQ ID NO:139)
6     F3231   5'-ccaggcctcattagggccctc-3' (SEQ ID NO:140)
      R3231   5'-ctgaagaggagcctggggtcag-3' (SEQ ID NO:141)
7     F3222   5'-ctgagatttctgactcttggggtg-3' (SEQ ID NO:142)
      R3211   5'-aaggttctgccctcatgccccatg-3' (SEQ ID NO:143)
8     F3561   5'-ctggcctgagggatcagcagg-3' (SEQ ID NO:144)
      R3561   5'-gtgcatacatacagcccacggag-3' (SEQ ID NO:145)
9     F3551   5'-gagctattgggttggccgtgtggg-3' (SEQ ID NO:146)
      R3552   5'-accaacacggagaagtgagaactg-3' (SEQ ID NO:147)
10    F3201   5'-ccacactttatttaacgctttggcgg-3' (SEQ ID NO:148)
      R3201   5'-cagaaccaaaatgcaaggatacgg-3' (SEQ ID NO:149)
11    F3191   5'-cttctgattctgggatcaccaaagg-3' (SEQ ID NO:150)
      F3191   5'-ggaccgtaaggaagacccaggg-3' (SEQ ID NO:151)
12    F3181   5'-cctgtgctcaggagcgcatgaagg-3' (SEQ ID NO:152)
      R3181   5'-gcagacctcccacccaagggcg-3' (SEQ ID NO:153)
13    F3171   5'-gagacagatgggggacagtcaggg-3' (SEQ ID NO:154)
      R3171   5'-cctcccgagagaaccctcctg-3' (SEQ ID NO:155)
14    F3161   5'-gggagcccagagtccccatgg-3' (SEQ ID NO:156)
      R3161   5'-gggcctccttgggtttgctgg-3' (SEQ ID NO:157)
15    F3541   5'-gcctccccagcatcctgccgg-3' (SEQ ID NO:158)
      R3541   5'-tcactgagccgaatgaaactgagg-3' (SEQ ID NO:159)
16    F3531   5'-tgtggcctgagttcctttcctgtg-3' (SEQ ID NO:160)
      R3531   5'-ggtcaaagggcagaacgaagaggg-3' (SEQ ID NO:161)
17    F3151   5'-cccgtccttctcccagccatg-3' (SEQ ID NO:162)
      R3151   5'-ctcccctggttgtccccaagg-3' (SEQ ID NO:163)
18    F3141   5'-cgacccctctgattgccacttgtg-3' (SEQ ID NO:164)
      R3141   5'-ggcatcctgcccttgccaggg-3' (SEQ ID NO:165)
19    F3522   5'-tctgtctcccctgctccttg-3' (SEQ ID NO:166)
      R3522   5'-cttccctgccccgacgcccag-3' (SEQ ID NO:167)
20    F3121   5'-gctcctcccgtgaccctctgg-3' (SEQ ID NO:103)
      R3121   5'-gggtcccagccaggagcactg-3' (SEQ ID NO:104)
21    F3111   5'-cagcgctcaggcccgtctctc-3' (SEQ ID NO:168)
      R3111   5'-tgcataggcatgtgcagctttggg-3' (SEQ ID NO:169)
22    F3512   5'-catgcaccctctgccctgtgg-3' (SEQ ID NO:170)
      R3512   5'-agttgagccaggagaggtggg-3' (SEQ ID NO:171)
23    F3101   5'-catcaggcgcattccatctgtccg-3' (SEQ ID NO:172)
      R3091   5'-agcaggagagcagaagaagaaagg-3' (SEQ ID NO:173)
24    F3082   5'-gtgtgtcaccatccccaccccg-3' (SEQ ID NO:174)
      R3082   5'-caagagatgggagaaaggccttatg-3' (SEQ ID NO:175)
25    F3073   5'-ctgggacatccggatcctgaagg-3' (SEQ ID NO:176)
      R3073   5'-tccaggtagtgggaggcagagg-3' (SEQ ID NO:177)
26    F3061   5'-tcccactacctggagctgccttgg-3' (SEQ ID NO:178)
      R3051   5'-ggctctccccagccctccctg-3' (SEQ ID NO:179)
27    F3601   5'-cagagcagcagagactctgaccag-3' (SEQ ID NO:180)
      R3601   5'-tagaccccacctgcccctgag-3' (SEQ ID NO:181)
28    F3501   5'-tcctctcattgcttgcctgttcgg-3' (SEQ ID NO:182)
      R3501   5'-ttgagagcttgccggggatgg-3' (SEQ ID NO:183)
29    F3031   5'-aagtgccaagcaatgagtgaccgg-3' (SEQ ID NO:184)
      R3021   5'-ctcactcccacccaccacctg-3' (SEQ ID NO:185)
30    F3011   5'-cccaccggcctctgagtctgc-3' (SEQ ID NO:186)
      R3001   5'-accctacccaagccaggacaagtg-3' (SEQ ID NO:187)
31    F2141   5'-gaatctgccataaccagcttcgtg-3' (SEQ ID NO:188)
      R2141   5'-tatcaccccatagaggcctcgaag-3' (SEQ ID NO:189)
32    F2981   5'-cagccactcactctggcacctctg-3' (SEQ ID NO:190)
      R2981   5'-agcccacagtctctgactctcctg-3' (SEQ ID NO:191)
33    F2131   5'-acatctctcagggtccctgctgtg-3' (SEQ ID NO:192)
      R2211   5'-cctgtgaggggacgaggcagg-3' (SEQ ID NO:193)
34    F2202   5'-gccctgggtaagggatgctgattc-3' (SEQ ID NO:194)
      R2202   5'-cctgcctgggcctcctggatc-3' (SEQ ID NO:195)
35    F2111   5'-gagggtgatgggggccttagg-3' (SEQ ID NO:196)
      R2112   5'-gcaatcagtttgaagaaggaaagg-3' (SEQ ID NO:197)
36    F2102   5'-cccctctcaccatctcctgatgtg-3' (SEQ ID NO:105)
      R2111   5'-ggcttcaccttccctctacctcgg-3' (SEQ ID NO:106)
37    F2101   5'-cacctttgtctccattctacctgc-3' (SEQ ID NO:198)
      R2101   5'-ctcccagcccccacgcccagg-3' (SEQ ID NO:199)
38    F2091   5'-ctgagccactctcctcattctgtg-3' (SEQ ID NO:200)
      R2091   5'-tggaaggggacagtagggagg-3' (SEQ ID NO:201)
39    F2081   5'-ggccagtgcgttcttcctcctc-3' (SEQ ID NO:202)
      R2071   5'-tccctgacctgcccatcatctc-3' (SEQ ID NO:203)
40    F2061   5'-gcccctgtcaggcctggatgg-3' (SEQ ID NO:204)
      R2061   5'-tgacccaggcctccctggagg-3' (SEQ ID NO:205)
41    F2051   5'-ctgaaatggtctctttctttctac-3' (SEQ ID NO:206)
      R2051   5'-cacaccgactgtcagactgaagag-3' (SEQ ID NO:207)
42    F2041   5'-ttgtcccctcctctaatccccatg-3' (SEQ ID NO:208)
      R2041   5'-gggttagggacgtcttcgagg-3' (SEQ ID NO:209)
43    F2031   5'-cagccaaaccatatcaacaatg-3' (SEQ ID NO:210)
      R2021   5'-ctggggaggtgagggctctag-3' (SEQ ID NO:211)
44    F2011   5'-gaagtgttttgtctcctcctc-3' (SEQ ID NO:212)
      R2011   5'-gcaggcagccagcccccatc-3' (SEQ ID NO:213)
45    F1021   5'-gggtgccctgtgttggctgac-3' (SEQ ID NO:214)
      R1031   5'-gcaggcagccagcccccatc-3' (SEQ ID NO:215)
46    F1041   5'-ctcgtctatgtcttgtgcttgctc-3' (SEQ ID NO:216)
      R1051   5'-caccatggtttggggtcatgtgg-3' (SEQ ID NO:217)
47    F1061   5'-tctcgcttccccagctcctgc-3' (SEQ ID NO:218)
      R1061   5'-tctggagttcgaggactctggg-3' (SEQ ID NO:219)
48    F1071   5'-agaagggtggggagagaacgg-3' (SEQ ID NO:220)
      R1071   5'-cagctcagagcctgtggctgg-3' (SEQ ID NO:221)
49    F1082   5'-aaggccttcccatcctttggtagg-3' (SEQ ID NO:222)
      R1082   5'-acaacccagagggagcacggg-3' (SEQ ID NO:223)
50    F1092   5'-gttgacgatgtatatactgtgttgg-3' (SEQ ID NO:224)
      R1091   5'-gctggcaccacagggaatcgg-3' (SEQ ID NO:110)
51    F1102   5'-gcctctctctaactttgcttccttg-3' (SEQ ID NO:225)
      R1101   5'-agcccccatgtgcagaatggg-3' (SEQ ID NO:112)
52    F1112   5'-ggctacaggctggcagtgatcgag-3' (SEQ ID NO:226)
      R1112   5'-ttcccccatgccctccactgg-3' (SEQ ID NO:227)
53    F1121   5'-agccttcgtgcccctaaccaagtg-3' (SEQ ID NO:228)
      R1121   5'-ctgtgggcattggggctcagg-3' (SEQ ID NO:229)
54    F1141   5'-ggatgcccagttgactccggg-3' (SEQ ID NO:115)
      R1141   5'-ccccaccacagtgtcgtcagg-3' (SEQ ID NO:116)
55    F1151   5'-gccccagtgggatcaccatg-3' (SEQ ID NO:230)
      R116    5'-atgctggaggggaccccacgg-3' (SEQ ID NO:231)

Comparison of Dysferlin With Other Proteins

The 6,243 bp ORF of this candidate MM gene is predicted to
encode 2,080 amino acids (Figs. 1C and 2; SEQ ID NO:2). At the
amino acid level, this protein is highly homologous to the
nematode (Caenorhabditis elegans) protein fer-1 (27% identical,
57% identical or similar; the sequence alignment and comparison
was performed using
http://vega.igh.cnrs.fr/bin/nph-align_query.pl) (Argon & Ward,
1980, Genetics 96:413-33; Achanzar & Ward, 1997, J. Cell
Science 110:1073-81). This dystrophy-associated, fer-1-like
protein has therefore been designated "dysferlin."
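
The relationship between the ORF length and the predicted
protein length can be checked with a few lines of arithmetic.
The sketch assumes that the stated 6,243 bp includes the
terminal stop codon, which the text does not say explicitly; the
function name is hypothetical.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


ORF_BP = 6243  # value stated in the text
print(residues_from_orf(ORF_BP), "amino acids")
```

With the stop codon counted as part of the ORF, the result
agrees with the 2,080 residues stated above.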

The fer-1 protein was originally identified through molecular
genetic analysis of a class of fertilization-defective C.
elegans mutants in which spermatogenesis is abnormal (Argon &
Ward, 1980, Genetics 96:413-33). The mutant fer-1 spermatozoa
have defective mobility and show imperfect fusion of membranous
organelles (Ward et al., 1981, J. Cell Bio. 91:26-44). Like
fer-1, dysferlin is a large protein with an extensive, highly
charged hydrophilic region and a single predicted membrane
spanning region at the carboxy terminus (Fig. 3). There is a
membrane retention sequence 3' to the membrane spanning
stretch, indicating that the protein may be preferentially
targeted to either endoplasmic or sarcoplasmic reticulum,
probably as a Type II protein (i.e., with the NH2 end and most
of the following protein located within the cytoplasm) (Fig.
1C). Several nuclear membrane targeting sequences are predicted
within the cytoplasmic domain of the protein
(http://psort.nibb.ac.jp/form.html). Immunocytochemical
detection of dysferlin suggests that dysferlin is targeted to
or anchored within the sarcoplasmic reticulum.

The cytoplasmic component of this protein contains four motifs
homologous to C2 domains. C2 domains are intracellular protein
modules composed of 80-130 amino acids (Rizo & Sudhof, 1998, J.
Biol. Chem. 273:15897). Originally identified within a
calcium-dependent isoform of protein kinase C (Nishizuka, 1988,
Nature 334:661-65), C2 domains are present in numerous
proteins. These domains often arise in approximately homologous
pairs described as double C2 or DOC2 domains. One DOC2 protein,
DOC2α, is brain specific and highly concentrated in synaptic
vesicles (Orita et al., 1995, Biochem. Biophys. Res. Comm.
206:439-48), while another, DOC2β, is ubiquitously expressed
(Sakaguchi et al., 1995, Biochem. Biophys. Res. Comm.
217:1053-61). Many C2 modules can fold to bind calcium, thereby
initiating signaling events such as phospholipid binding. At
distal nerve terminals, for example, the synaptic vesicle
protein synaptotagmin has two C2 domains that, upon binding
calcium, permit this protein to interact with syntaxin,
triggering vesicle fusion with the distal membrane and
neurotransmitter release (Sudhof & Rizo, 1996, Neuron
17:379-88).

The four dysferlin C2 domains are located at amino acid
positions 32-82, 431-475, 1160-1241, and 1582-1660 (Figs. 1C
and 3). Indeed, it is almost exclusively through these regions
that dysferlin has homology to any proteins other than fer-1.
Each of these segments in dysferlin is considerably smaller
than a typical C2 domain. Moreover, these segments are more
widely separated in comparison with the paired C2 regions in
synaptotagmin, DOC2α and β and related C2-positive proteins.
For this reason, it is difficult to predict whether the four
relatively short C2 domains in dysferlin function analogously
to conventional C2 modules. That dysferlin might, by analogy
with synaptotagmin, signal events such as membrane fusion is
suggested by the fact that fer-1 deficient worms show defective
membrane organelle fusion within spermatozoa (Ward et al.,
1981, J. Cell Bio. 91:26-44).

The invention will be further described in the following
examples, which do not limit the scope of the invention
described in the claims.

EXAMPLES 

Example 1: Production of dysferlin protein 

Standard methods can be used to synthesize either wild type or
mutant dysferlin, or fragments of either. These methods can
also be used to synthesize brain-specific dysferlin
polypeptides including full-length or fragments (e.g., a
polypeptide unique to brain-specific dysferlin). For example, a
recombinant expression vector encoding dysferlin (or a fragment
thereof: e.g., dysferlin minus its membrane-spanning region)
operably linked to appropriate expression control sequences can
be used to express dysferlin in a prokaryotic (e.g., E. coli)
or eukaryotic host (e.g., insect cells, yeast cells, or
mammalian cells). The protein is then purified by standard
techniques. If desired, DNA encoding part or all of the
dysferlin sequence can be joined in-frame to DNA encoding a
different polypeptide, to produce a chimeric DNA that encodes a
hybrid polypeptide. This can be used, for example, to add a tag
that will simplify identification or purification of the
expressed protein, or to render the dysferlin (or fragment
thereof) more immunogenic.

The preferred means for making short peptide fragments of
dysferlin is by chemical synthesis. These fragments, like
dysferlin itself, can be used to generate antibodies, or as
positive controls for antibody-based assays.

Fusion proteins are useful, e.g., for generating antibodies.
Such fusion proteins are generated using known methods. In one
example, to construct glutathione S-transferase (GST):dysferlin
fusion proteins, the BLAST program (Altschul et al., 1990, J.
Molec. Biol. 215:403-410) was used to identify three regions of
the dysferlin cDNA that show no homology to any known human
proteins (Figure 1). These were subcloned from the dysferlin
cDNA as BstYI (881-1333), XmnI (1990-2718) and SalI (5364-5732)
fragments ligated respectively into BamHI, SmaI and SalI sites
of pGEX-5X-3 (Pharmacia). The three fragments correspond to
amino acid sequences at amino acid locations 253-403, 624-865,
and 1664-1786 of SEQ ID NO:2, respectively. The resulting GST
fusion proteins of BamHI (43 kDa) and SmaI (53.3 kDa) formed
insoluble aggregates that were isolated by SDS-PAGE. The fusion
protein of SalI (40.2 kDa) was soluble and thus could be
purified using a glutathione Sepharose 4B column; the SalI
dysferlin fragment (14.2 kDa) was isolated by cleavage from GST
using Factor Xa protease. The eluted protein was concentrated
and further purified by SDS-PAGE. For all three of the fusion
peptides, the resulting SDS-PAGE bands were excised and used to
immunize rabbits.
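
The reported fusion-protein sizes can be sanity-checked against
the amino acid ranges given above. The sketch below is a rough
estimate only: it assumes an average residue mass of 110 Da (a
common rule of thumb, not a value from the text) and takes the
GST partner mass as the difference between the reported SalI
fusion (40.2 kDa) and its cleaved fragment (14.2 kDa). All names
are hypothetical.

```python
AVG_RESIDUE_KDA = 0.110          # assumption: ~110 Da per amino acid
GST_KDA = 40.2 - 14.2            # inferred from the SalI fusion and fragment

# site -> (first residue, last residue, reported fusion mass in kDa),
# all taken from the text.
FRAGMENTS = {
    "BamHI": (253, 403, 43.0),
    "SmaI": (624, 865, 53.3),
    "SalI": (1664, 1786, 40.2),
}


def estimated_fusion_kda(first: int, last: int) -> float:
    """Estimated mass of GST plus a dysferlin fragment (inclusive range)."""
    residues = last - first + 1
    return GST_KDA + residues * AVG_RESIDUE_KDA


estimates = {
    site: estimated_fusion_kda(first, last)
    for site, (first, last, _reported) in FRAGMENTS.items()
}

for site, (first, last, reported) in FRAGMENTS.items():
    print(f"{site}: {last - first + 1} residues, "
          f"estimated {estimates[site]:.1f} kDa, reported {reported} kDa")
```

Under these assumptions each estimate falls within about 1 kDa
of the reported value; exact masses would require the actual
residue composition and any vector-derived linker residues.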

Example 2: Production and characterization of anti-dysferlin
antibodies

Techniques for generating both monoclonal and polyclonal
antibodies specific for a particular protein are well known.
The antibodies can be raised against a short peptide epitope of
dysferlin, an epitope linked to a known immunogen to enhance
immunogenicity, a long fragment of dysferlin, or the intact
protein. Antibodies can also be raised against brain-specific
dysferlin polypeptides, e.g., against amino acids 1-24 of SEQ
ID NO:233. Such antibodies raised against dysferlin or
brain-specific dysferlin polypeptides are useful, e.g., for
localizing such polypeptides in tissue sections or fractionated
cell preparations and diagnosing dysferlin-related disorders.

An isolated dysferlin protein, or a portion or fragment
thereof, can be used as an immunogen to generate antibodies
that bind dysferlin using standard techniques for polyclonal
and monoclonal antibody preparation. The dysferlin immunogen
can also be a mutant dysferlin or a fragment of a mutant
dysferlin. A full-length dysferlin protein can be used or,
alternatively, antigenic peptide fragments of dysferlin can be
used as immunogens. The antigenic peptide of dysferlin
comprises at least 8 (preferably 10, 15, 20, or 30) amino acid
residues of the amino acid sequence shown in SEQ ID NO:2 and
encompasses an epitope of dysferlin such that an antibody
raised against the peptide forms a specific immune complex with
dysferlin. Preferred epitopes encompassed by the antigenic
peptide are regions of dysferlin that are located on the
surface of the protein, e.g., hydrophilic regions.

A dysferlin immunogen typically is used to prepare antibodies
by immunizing a suitable subject (e.g., rabbit, goat, mouse or
other mammal) with the immunogen. An appropriate immunogenic
preparation can contain, for example, recombinantly expressed
dysferlin protein or a chemically synthesized dysferlin
polypeptide. The preparation can further include an adjuvant,
such as Freund's complete or incomplete adjuvant, or similar
immunostimulatory agent. Immunization of a suitable subject
with an immunogenic dysferlin preparation induces a polyclonal
anti-dysferlin antibody response.

Polyclonal anti-dysferlin antibodies ("dysferlin antibodies")
can be prepared as described above by immunizing a suitable
subject with a dysferlin immunogen. The dysferlin antibody
titer in the immunized subject can be monitored over time by
standard techniques, such as with an enzyme linked
immunosorbent assay (ELISA) using immobilized dysferlin. If
desired, the antibody molecules directed against dysferlin can
be isolated from the mammal (e.g., from the blood) and further
purified by well-known techniques, such as protein A
chromatography to obtain the IgG fraction. At an appropriate
time after immunization, e.g., when the dysferlin antibody
titers are highest, antibody-producing cells can be obtained
from the subject and used to prepare monoclonal antibodies by
standard techniques, such as the hybridoma technique originally
described by Kohler and Milstein (1975) Nature 256:495-497, the
human B cell hybridoma technique (Kozbor et al. (1983) Immunol.
Today 4:72), the EBV-hybridoma technique (Cole et al. (1985),
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc.,
pp. 77-96) or trioma techniques. The technology for producing
hybridomas is well known (see generally Current Protocols in
Immunology (1994) Coligan et al. (eds.) John Wiley & Sons,
Inc., New York, NY). Briefly, an immortal cell line (typically
a myeloma) is fused to lymphocytes (typically splenocytes) from
a mammal immunized with a dysferlin immunogen as described
above, and the culture supernatants of the resulting hybridoma
cells are screened to identify a hybridoma producing a
monoclonal antibody that binds dysferlin.

Any of the many well known protocols used for fusing
lymphocytes and immortalized cell lines can be applied for the
purpose of generating a monoclonal antibody against dysferlin
(see, e.g., Current Protocols in Immunology, supra; Galfre et
al. (1977) Nature 266:550-52; R.H. Kenneth, in Monoclonal
Antibodies: A New Dimension In Biological Analyses, Plenum
Publishing Corp., New York, New York (1980); and Lerner (1981)
Yale J. Biol. Med., 54:387-402). Moreover, one of ordinary
skill in the art will appreciate that there are many variations
of such methods which also would be useful. Hybridoma cells
producing a monoclonal antibody of the invention are detected
by screening the hybridoma culture supernatants for antibodies
that bind dysferlin, e.g., using a standard ELISA assay.

As an alternative to preparing monoclonal antibody-secreting
hybridomas, a monoclonal dysferlin antibody can be identified
and isolated by screening a recombinant combinatorial
immunoglobulin library (e.g., an antibody phage display
library) with dysferlin to thereby isolate immunoglobulin
library members that bind dysferlin. Kits for generating and
screening phage display libraries are commercially available
(e.g., the Pharmacia Recombinant Phage Antibody System, Catalog
No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit,
Catalog No. 240612). Additionally, examples of methods and
reagents particularly amenable for use in generating and
screening antibody display libraries can be found in, for
example, U.S. Patent No. 5,223,409; PCT Publication No. WO
92/18619; PCT Publication No. WO 91/17271; PCT Publication No.
WO 92/20791; PCT Publication No. WO 92/15679; PCT Publication
No. WO 93/01288; PCT Publication No. WO 92/01047; PCT
Publication No. WO 92/09690; PCT Publication No. WO 90/02809;
Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al.
(1992) Hum. Antibod. Hybridomas 3:81-85; Huse et al. (1989)
Science 246:1275-1281; Griffiths et al. (1993) EMBO J.
12:725-734.

As an example, two polyclonal antisera were raised
for each of the fusion peptide antigens described above
using New Zealand White rabbits. The rabbits were
injected with 0.5 mg of antigen using keyhole limpet
hemocyanin (KLH) as the adjuvant. Booster injections of
0.25 mg antigen were administered every three weeks over
12 weeks. Serum was prepared from the rabbits and was
purified using affinity column chromatography (HiTrap;
Pharmacia) or antigen-blotted polyvinylidene difluoride
(PVDF) membrane.

Western immunoblotting (WIB) was used to verify that the
affinity-purified antisera recognized the cognate fusion
peptides and that this reactivity was immunoadsorbed by
pre-incubation of the antisera with the peptides. Thus,
antiserum raised against the polypeptide encoded by the
SalI fragment (encoding amino acids 1664-1786) identified
the fragment both as a cleaved, 14.2 kDa fragment and as a
component of the 40.2 kDa GST-SalI fusion peptide. No
reactivity was evident in the fraction containing only the
GST fusion partner. Immunoadsorption entirely abolished
this staining. Analogous results were obtained with all
six antisera (to the three different target fusion
peptides).
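
The band sizes reported above can be cross-checked with
simple arithmetic. The following Python sketch is
illustrative only: the average residue mass of about 110 Da
is a general rule of thumb and is an assumption, not a
value taken from this description.

```python
# Rough consistency check of the reported fusion-peptide band sizes.
# Assumption (not from the text): ~110 Da average mass per residue.

AVG_RESIDUE_DA = 110.0


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive amino acid range."""
    return last - first + 1


def estimated_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


n = residue_count(1664, 1786)         # residues encoded by the fragment
fragment_estimate = estimated_kda(n)  # mass estimated from length alone
gst_implied = round(40.2 - 14.2, 1)   # fusion band minus cleaved band, kDa

print(n, round(fragment_estimate, 1), gst_implied)
```

Under these assumptions the 123-residue fragment is
estimated at roughly 13.5 kDa, in line with the 14.2 kDa
band, and the 26.0 kDa difference between the two bands is
consistent with the size commonly quoted for a GST fusion
partner.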

Preparation of subcellular fractions

Frozen human muscle (0.3 g) was homogenized in five
volumes of 0.25 M sucrose containing proteinase inhibitor
(Complete, Boehringer). Subcellular fractions of nuclei,
mitochondria, microsomes, and cytosol were separated by
differential centrifugation. The purity of each fraction
was evaluated by immunoblotting of fraction-specific
proteins with antibodies to histone H1 (Calbiochem),
cytochrome c (Santa Cruz), Na+-K+ ATPase α1 subunit
(Research Diagnostics) and cytosolic superoxide dismutase
(Calbiochem).



Dysferlin in subcellular fractions

Immunoblotting was used to analyze dysferlin
expression. Twenty µg of each subcellular fraction and
40 µg of whole homogenate of muscle were separated by
SDS-PAGE (4-15% gradient gel) and transferred to a
nitrocellulose membrane. Immunoblotting was performed
according to standard methods, using chemiluminescence
(ECL, Amersham). Immunoblotting of multi-tissue blots
identified prominent dysferlin positivity at
approximately 230 kDa in heart, placenta, skeletal muscle
and kidney. Little or no immuno-positive staining was
detected in brain, liver, spleen, ovary, or testis.
Lower molecular weight bands (approximately 40 kDa) were
also evident. Immunoadsorption with the corresponding
fusion peptide abolished both the large and the smaller
bands. The 230 kDa band was observed with all of the
affinity-purified anti-dysferlin antisera.

Immunoblotting of fractionated human muscle
documented distinct 230 kDa bands in the whole muscle
homogenate and in microsomal and nuclear fractions. Some
immunoreactivity was also evident in the nuclear and
mitochondrial fractions. No immunoreactivity was
detected in the cytosolic fractions. This pattern was
seen with all of the anti-dysferlin antisera, and was
eliminated by immunoadsorption. The identity of the
assayed fractions was verified by Western blotting using
fraction-specific antibodies: histone H1 for the nuclear
fraction, cytochrome c for the mitochondrial fraction,
Na+-K+ ATPase α1-subunit for the microsomal fraction, and
SOD1 for the cytosolic fraction.



Example 3: Diagnosis

The discovery of mutations in the dysferlin gene
that are associated with the MM and LGMD2B phenotypes
means that individuals can be tested for the disease gene
before symptoms appear. This will permit genetic testing
and counseling of those with a family history of the
disease. Additionally, individuals diagnosed with the
genetic defect can be closely monitored for the
appearance of symptoms, thereby permitting early
intervention, including genetic therapy, as appropriate.
Individuals with a brain-specific dysferlin-related
disorder can be diagnosed using such methods.

Diagnosis can be carried out on any suitable genomic
DNA sample from the individual to be tested. Typically,
a blood sample from an adult or child, or a sample of
placental or umbilical cord cells of a newborn would be
used; alternatively, one could utilize a fetal sample
obtained by amniocentesis or chorionic villi sampling.

It is expected that standard genetic diagnostic
methods can be used. For example, PCR can be utilized to
identify the presence of a deletion, addition, or
substitution of one or more nucleotides within any one of
the exons of dysferlin. Following the PCR reaction, the
PCR product can be analyzed by methods such as a
heteroduplex detection technique based upon that of White
et al. (1992, Genomics 12:301-06), or by techniques such
as cleavage of RNA-DNA hybrids using RNase A (Myers et
al., 1985, Science 230:1242-46), single-stranded
conformation polymorphism (SSCP) analysis (Orita et al.,
1989, Genomics 10:298-99), di-deoxy-fingerprinting (DDF)
(Blaszyk et al., 1995, Biotechniques 18:256-260) and
denaturing gradient gel electrophoresis (DGGE; Myers et
al., 1987, Methods Enzymol. 155:501-27). The PCR may be
carried out using a primer which adds a G+C rich sequence
(termed a "GC-clamp") to one end of the PCR product, thus
improving the sensitivity of the subsequent DGGE
procedure (Sheffield et al., 1989, Proc. Natl. Acad. Sci.
USA 86:232-36). If the particular mutation present in
the patient's family is known to have removed or added a
restriction site, or to have significantly increased or
decreased the length of a particular restriction
fragment, a protocol based upon restriction fragment
length polymorphism (RFLP) analysis (perhaps combined
with PCR) may be appropriate.
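
The RFLP reasoning above can be illustrated in silico. The
following Python sketch compares the recognition sites
present in a wild type and a mutant amplicon. The two short
sequences are invented for illustration and are not
dysferlin sequences; the PstI and BamHI recognition
sequences are the standard ones. The position comparison
assumes a substitution (alleles of equal length), and the
fragment lengths ignore where within the site the enzyme
actually cuts.

```python
# Sketch: does a point mutation create or destroy a restriction site?
# The sequences below are hypothetical and used for illustration only.

RECOGNITION_SITES = {
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
}


def site_positions(seq: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def fragment_lengths(seq: str, site: str) -> list[int]:
    """Fragment lengths when seq is split at each site start."""
    cuts = [0] + site_positions(seq, site) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]


def compare_alleles(wild_type: str, mutant: str, enzyme: str) -> dict:
    """Sites lost or gained in the mutant (substitutions only)."""
    site = RECOGNITION_SITES[enzyme]
    wt = set(site_positions(wild_type, site))
    mt = set(site_positions(mutant, site))
    return {"lost": sorted(wt - mt), "gained": sorted(mt - wt)}


wild_type = "AATTCTGCAGGTACCATGGATTC"  # hypothetical, one PstI site
mutant = "AATTCTGCAAGTACCATGGATTC"     # hypothetical G->A inside the site

result = compare_alleles(wild_type, mutant, "PstI")
print(result)
print(fragment_lengths(wild_type, "CTGCAG"))
print(fragment_lengths(mutant, "CTGCAG"))
```

In this toy case the substitution removes the only PstI
site, so the mutant amplicon stays uncut while the wild
type amplicon yields two fragments, which is the kind of
difference an RFLP protocol would resolve on a gel.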

The apparent genetic heterogeneity resulting in the
MM/LGMD2B phenotypes means that the nature of the
particular mutation carried by affected individuals in
the patient's family may have to be ascertained prior to
attempting genetic diagnosis of the patient.
Alternatively, a battery of tests designed to identify
any of several mutations known to result in MM/LGMD2B may
be utilized to screen individuals without a defined
familial genotype. The analysis can be carried out on
any genomic DNA derived from the patient, typically from
a blood sample.

Instead of basing the diagnosis on analysis of the
genomic DNA of a patient, one could seek evidence of the
mutation in the level or nature of the relevant
expression products. Well-known techniques for analyzing
expression include mRNA-based methods, such as Northern
blots and in situ hybridization (using a nucleic acid
probe derived from the relevant cDNA), and quantitative
PCR (as described in St-Jacques et al., 1994,
Endocrinology 134:2645-57). One could also employ
polypeptide-based methods, including the use of
antibodies specific for the polypeptide of interest.
These techniques permit quantitation of the amount of
expression of a given gene in the tissue of interest, at
least relative to positive and negative controls. One
would expect an individual who is heterozygous for a
genetic defect affecting the level of expression of
dysferlin to show up to a 50% loss of expression of this
gene in such a hybridization or antibody-based assay. An
antibody specific for the carboxy terminal end would be
likely to pick up (by failure to bind to) most or all
frameshift and premature termination signal mutations, as
well as deletions of the carboxy terminal sequence. Use
of a battery of monoclonal antibodies specific for
different epitopes of dysferlin would be useful for
rapidly screening cells to detect those expressing mutant
forms of dysferlin (i.e., cells which bind to some
dysferlin-specific monoclonal antibodies, but not to
others), or for quantifying the level of dysferlin on the
surface of cells. One could also use a protein
truncation assay (Heim et al., 1994, Nature Genetics
8:218-19) to screen for any genetic defect which results
in the production of a truncated polypeptide instead of
the wild type protein.
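
Quantitation relative to positive and negative controls, as
described above, can be sketched as follows. This Python
example is illustrative only: the function names and the
signal values are hypothetical, and the calculation assumes
that the assay signal is linear between the two controls.

```python
# Sketch: express an assay signal relative to positive and negative
# controls. All numbers are hypothetical densitometry units.


def relative_expression(sample: float, positive: float,
                        negative: float) -> float:
    """Scale a signal so the negative control is 0.0 and the
    positive control is 1.0 (assumes a linear assay response)."""
    if positive <= negative:
        raise ValueError("positive control must exceed negative control")
    return (sample - negative) / (positive - negative)


def percent_loss(sample: float, positive: float, negative: float) -> float:
    """Percent reduction in expression relative to the positive control."""
    return 100.0 * (1.0 - relative_expression(sample, positive, negative))


loss = percent_loss(sample=560.0, positive=1000.0, negative=100.0)
print(f"{loss:.1f}% loss of expression")
```

A loss approaching 50% in such a comparison would be in
line with the heterozygous case discussed above, although
any decision threshold would have to be established
empirically for the particular assay.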

Use of immunodetection to identify normal and
disease-associated dysferlin

In the following example, immunodetection methods
are used to demonstrate a detectable difference in
muscle homogenates between normal and disease-associated
dysferlin alleles.

Frozen muscle samples (quadriceps) were homogenized
in ten volumes of SDS-PAGE sample buffer and boiled for 5
minutes. The final loading volume of SDS-PAGE was
adjusted after densitometric measurements (NIH Image) of
myosin heavy chain on the Coomassie blue stained gels.
Studies were performed on six MM, two LGMD-2B, and three
normal muscle samples.
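
The adjustment of loading volume from myosin heavy chain
densitometry can be sketched as follows. This Python
example is an illustration under stated assumptions, not
the procedure actually used: the sample names, densities,
and trial volume are hypothetical, and the band density is
assumed to scale linearly with the volume loaded.

```python
# Sketch: equalize gel loading using myosin heavy chain (MHC)
# densitometry from a trial gel. All values are hypothetical.


def adjusted_volumes(mhc_density: dict[str, float],
                     trial_volume_ul: float,
                     reference: str) -> dict[str, float]:
    """Loading volume per sample that would give every lane the
    same MHC signal as the reference sample."""
    if any(d <= 0 for d in mhc_density.values()):
        raise ValueError("densities must be positive")
    target = mhc_density[reference]
    return {name: trial_volume_ul * target / density
            for name, density in mhc_density.items()}


densities = {"normal_1": 200.0, "MM_1": 160.0, "LGMD2B_1": 250.0}
volumes = adjusted_volumes(densities, trial_volume_ul=10.0,
                           reference="normal_1")
print(volumes)
```

Samples with a weaker myosin heavy chain band than the
reference receive a proportionally larger loading volume,
so that differences in the dysferlin band are less likely
to reflect unequal loading.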

Immunocytochemistry was performed on 8 micron
cryostat sections of the muscle that were fixed in 100%
cold acetone for 5 minutes and preincubated with PBS
containing 1% BSA, 5% heat-inactivated goat serum and
0.2% Triton® X-100. The sections were incubated with
primary antibodies overnight at 4°C and fluorescein-labeled
secondary (TAGO Immunologicals) for 30 minutes at
room temperature. The primary antibodies were applied in
two double staining combinations: SalI-1 anti-dysferlin
and anti-dystrophin antibodies, and SalI-2 anti-dysferlin
and anti-δ-sarcoglycan antibodies. The sections were
mounted in SlowFade (Molecular Probes).

The 230 kDa antigen was absent in samples from all
five MM patients in immunoblot assays. All five patients
had normal patterns of dystrophin expression. Genetic
analysis of the dysferlin gene in the patients predicted
that at least two of the five MM patients should have no
full-length protein. Two of the other three patients had
mutations in at least one allele that are predicted to
eliminate normal dysferlin expression. In all five
patients, absence of dysferlin immunostaining was
documented with at least two other anti-dysferlin
antisera.

Immunostaining of dysferlin, dystrophin and
δ-sarcoglycan proteins demonstrated distinct
membrane-associated positivity for each protein in normal
muscle. By contrast, in both MM and LGMD-2B muscle the
dysferlin protein was absent, while the dystrophin and
δ-sarcoglycan proteins appeared normal.

Therapeutic Treatment

A patient with MM/LGMD2B, or an individual
genetically susceptible to contracting one or both of
these diseases, can be treated by supplying dysferlin
therapeutic agents of the present invention. Dysferlin
therapeutic agents include a DNA or a subgenomic
polynucleotide coding for a functional dysferlin protein.
A DNA (e.g., a cDNA) is prepared which encodes the wild
type form of the gene operably linked to expression
control elements (e.g., promoter and enhancer) that
induce expression in skeletal muscle cells or any other
affected cells. The DNA may be incorporated into a
vector appropriate for transforming the cells, such as a
retrovirus, adenovirus, or adeno-associated virus.
Alternatively, one of the many other known techniques for
introducing DNA into cells in vivo may be used (e.g.,
liposomes). Particularly useful would be naked DNA
techniques, since naked DNA is known to be readily taken
up by skeletal muscle cells upon injection into muscle.
Wild type dysferlin protein can also be administered to an
individual who either expresses mutant dysferlin protein
or expresses an inadequate amount of dysferlin protein,
e.g., an MM/LGMD2B patient.

Administration of the dysferlin therapeutic agents
of the invention can include local or systemic
administration, including injection, oral administration,
particle gun, or catheterized administration, and topical
administration. Various methods can be used to
administer the therapeutic dysferlin composition directly
to a specific site in the body. For example, a specific
muscle can be located and the therapeutic dysferlin
composition injected several times in several different
locations within the body of the muscle. The
therapeutic dysferlin composition can be directly
administered to the surface of the muscle, for example,
by topical application of the composition. X-ray imaging
can be used to assist in certain of the above delivery
methods. Combination therapeutic agents, including a
dysferlin protein or polypeptide or a subgenomic
dysferlin polynucleotide and other therapeutic agents,
can be administered simultaneously or sequentially.

Receptor-mediated targeted delivery of therapeutic
compositions containing dysferlin subgenomic
polynucleotides to specific tissues can also be used.
Receptor-mediated DNA delivery techniques are described
in, for example, Findeis et al. (1993), Trends in
Biotechnol. 11, 202-05; Chiou et al. (1994), Gene
Therapeutics: Methods and Applications of Direct Gene
Transfer (J. A. Wolff, ed.); Wu & Wu (1988), J. Biol.
Chem. 263, 621-24; Wu et al. (1994), J. Biol. Chem. 269,
542-46; Zenke et al. (1990), Proc. Natl. Acad. Sci.
U.S.A. 87, 3655-59; Wu et al. (1991), J. Biol. Chem. 266,
338-42.

Alternatively, a dysferlin therapeutic composition
can be introduced into human cells ex vivo, and the cells
then implanted into the human. Cells can be removed from
a variety of locations including, for example, from a
selected muscle. The removed cells can then be contacted
with the dysferlin therapeutic composition utilizing any
of the above-described techniques, followed by the return
of the cells to the human, preferably to or within the
vicinity of a muscle. The above-described methods can
additionally comprise the steps of depleting fibroblasts
or other contaminating non-muscle cells subsequent to
removing muscle cells from a human.

Both the dose of the dysferlin composition and the
means of administration can be determined based on the
specific qualities of the therapeutic composition, the
condition, age, and weight of the patient, the
progression of the disease, and other relevant factors.
If the composition contains dysferlin protein or
polypeptide, effective dosages of the composition are in
the range of about 1 µg to about 100 mg/kg of patient
body weight, e.g., about 50 µg to about 50 mg/kg of
patient body weight, e.g., about 500 µg to about 5 mg/kg
of patient body weight.
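
The per-kilogram ranges above translate into total doses as
sketched below. This Python example is illustrative only.
It reads the quoted units as micrograms and milligrams and
assumes that both ends of each range are expressed per
kilogram of body weight; the class name and the 70 kg
patient weight are hypothetical.

```python
# Sketch: convert weight-based dose ranges into total dose bounds.
# Assumption: both bounds of each quoted range are per kg body weight.
from dataclasses import dataclass

UG_PER_MG = 1000.0


@dataclass(frozen=True)
class DoseRange:
    low_ug_per_kg: float
    high_ug_per_kg: float

    def for_patient(self, weight_kg: float) -> tuple[float, float]:
        """Total dose bounds in mg for a patient of the given weight."""
        if weight_kg <= 0:
            raise ValueError("weight must be positive")
        return (self.low_ug_per_kg * weight_kg / UG_PER_MG,
                self.high_ug_per_kg * weight_kg / UG_PER_MG)


BROAD = DoseRange(1.0, 100_000.0)   # about 1 µg/kg to about 100 mg/kg
MIDDLE = DoseRange(50.0, 50_000.0)  # about 50 µg/kg to about 50 mg/kg
NARROW = DoseRange(500.0, 5_000.0)  # about 500 µg/kg to about 5 mg/kg

low_mg, high_mg = NARROW.for_patient(70.0)  # hypothetical 70 kg patient
print(low_mg, high_mg)
```

For the hypothetical 70 kg patient the narrowest range
corresponds to a total of 35 mg to 350 mg of protein; the
appropriate value within any range would still depend on
the factors listed above.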

Therapeutic compositions containing dysferlin
subgenomic polynucleotides can be administered in a range
of about 0.1 µg to about 10 mg of DNA/dose for local
administration in a gene therapy protocol. Concentration
ranges of about 0.1 µg to about 10 mg, e.g., about 1 µg
to about 1 mg, e.g., about 10 µg to about 100 µg of DNA
can also be used during a gene therapy protocol. Factors
such as method of action and efficacy of transformation
and expression are considerations that will affect the
dosage required for ultimate efficacy of the dysferlin
subgenomic polynucleotides. Where greater expression is
desired over a larger area of tissue, larger amounts of
dysferlin subgenomic polynucleotides or the same amounts
readministered in a successive protocol of
administrations, or several administrations to different
adjacent or close tissue portions of, for example, a
muscle site, may be required to effect a positive
therapeutic outcome. In all cases, routine
experimentation in clinical trials will determine
specific ranges for optimal therapeutic effect.

Animal Model

A line of transgenic animals (e.g., mice, rats,
guinea pigs, hamsters, rabbits, or other mammals) can be
produced bearing a transgene encoding a defective form of
dysferlin. Standard methods of generating such
transgenic animals would be used, e.g., as described
below.

Alternatively, standard methods of producing null
(i.e., knockout) mice could be used to generate a mouse
which bears one defective and one wild type allele
encoding dysferlin. If desired, two such heterozygous
mice could be crossed to produce offspring which are
homozygous for the mutant allele. The homozygous mutant
offspring would be expected to have a phenotype
comparable to the human MM and/or LGMD2B phenotype, and
so serve as models for the human disease.
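
The expected outcome of the heterozygous cross described
above can be enumerated as follows. This Python sketch
assumes simple Mendelian segregation at a single autosomal
locus and equal viability of all genotypes; the genotype
notation ("+" for wild type, "-" for the mutant allele) is
chosen here for illustration.

```python
# Sketch: expected genotype frequencies from a single-locus cross,
# assuming Mendelian segregation and equal viability.
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[str, Fraction]:
    """Offspring genotype frequencies; each parent is a two-character
    genotype such as '+-' (one wild type and one mutant allele)."""
    counts = Counter("".join(sorted(pair))
                     for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


offspring = cross("+-", "+-")
for genotype, freq in sorted(offspring.items()):
    print(genotype, freq)
```

Under these assumptions one quarter of the offspring would
be homozygous for the mutant allele, one half heterozygous,
and one quarter homozygous wild type.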

For example, in one embodiment, dysferlin mutations
are introduced into a dysferlin gene of a cell, e.g., a
fertilized oocyte or an embryonic stem cell. Such cells
can then be used to create non-human transgenic animals
in which exogenous altered (e.g., mutated) dysferlin
sequences have been introduced into their genome or
homologously recombinant animals in which endogenous
dysferlin nucleic acid sequences have been altered. Such
animals are useful for studying the function and/or
activity of dysferlin and for identifying and/or
evaluating modulators of dysferlin function. As used
herein, a "transgenic animal" is a non-human animal,
preferably a mammal, more preferably a rodent such as a
rat or mouse, in which one or more of the cells of the
animal includes a transgene. Other examples of
transgenic animals include non-human primates, sheep,
dogs, cows, goats, chickens, amphibians, etc. A
transgene is exogenous DNA which is integrated into the
genome of a cell from which a transgenic animal develops
and which remains in the genome of the mature animal,
thereby directing the expression of an encoded gene
product in one or more cell types or tissues of the
transgenic animal. As used herein, a "homologously
recombinant animal" is a non-human animal, preferably a
mammal, more preferably a mouse, in which an endogenous
dysferlin gene has been altered by homologous
recombination between the endogenous gene and an
exogenous DNA molecule introduced into a cell of the
animal, e.g., an embryonic cell of the animal, prior to
completed development of the animal.

A transgenic animal of the invention can be created
by introducing a nucleic acid encoding a dysferlin
mutation into the male pronuclei of a fertilized oocyte,
e.g., by microinjection or retroviral infection, and
allowing the oocyte to develop in a pseudopregnant female
foster animal. A dysferlin cDNA sequence (e.g., that of
SEQ ID NO:1 or SEQ ID NO:3) can be introduced as a
transgene into the genome of a non-human animal.
Alternatively, a nonhuman homologue of the human
dysferlin gene can be isolated based on hybridization to
the human dysferlin sequence (e.g., cDNA) and used as a
transgene. Intronic sequences and polyadenylation
signals can also be included in the transgene to increase
the efficiency of expression of the transgene. Methods
for generating transgenic animals via embryo manipulation
and microinjection, particularly animals such as mice,
have become conventional in the art and are described,
for example, in U.S. Patent Nos. 4,736,866 and
4,870,009, U.S. Patent No. 4,873,191 and in Hogan,
Manipulating the Mouse Embryo, (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., 1986).
Similar methods are used for production of other
transgenic animals. A transgenic founder animal can be
identified based upon the presence of the mutant
dysferlin transgene in its genome and/or expression of
the mutant dysferlin mRNA in tissues or cells of the
animals. A transgenic founder animal can then be used to
breed additional animals carrying the transgene.
Moreover, transgenic animals carrying a transgene
encoding a mutant dysferlin can further be bred to other
transgenic animals carrying other transgenes.

To create a homologously recombinant animal, a
vector is prepared which contains at least a portion of a
dysferlin gene into which a deletion, addition or
substitution has been introduced to thereby alter a
dysferlin gene. In a preferred embodiment, the vector is
designed such that, upon homologous recombination, the
endogenous dysferlin gene is functionally disrupted
(i.e., no longer encodes a functional protein; also
referred to as a "knock out" vector). Alternatively, the
vector can be designed such that, upon homologous
recombination, the endogenous dysferlin gene is mutated
or otherwise altered (e.g., contains one of the mutations
described in Table 2). In the homologous recombination
vector, the altered portion of the dysferlin sequence is
flanked at its 5' and 3' ends by additional nucleic acid
of the dysferlin gene to allow for homologous
recombination to occur between the exogenous dysferlin
nucleic acid sequence carried by the vector and an
endogenous dysferlin gene in an embryonic stem cell. The
additional flanking dysferlin nucleic acid is of
sufficient length for successful homologous recombination
with the endogenous gene. Typically, several kilobases
of flanking DNA (both at the 5' and 3' ends) are included
in the vector (see, e.g., Thomas and Capecchi (1987) Cell
51:503 for a description of homologous recombination
vectors). The vector is introduced into an embryonic
stem cell line (e.g., by electroporation) and cells in
which the introduced dysferlin sequence has homologously
recombined with the endogenous dysferlin gene are
selected (see, e.g., Li et al. (1992) Cell 69:915). The
selected cells are then injected into a blastocyst of an
animal (e.g., a mouse) to form aggregation chimeras (see,
e.g., Bradley in Teratocarcinomas and Embryonic Stem
Cells: A Practical Approach, Robertson, ed. (IRL, Oxford,
1987) pp. 113-152). A chimeric embryo can then be
implanted into a suitable pseudopregnant female foster
animal and the embryo brought to term. Progeny harboring
the homologously recombined DNA in their germ cells can
be used to breed animals in which all cells of the animal
contain the homologously recombined DNA by germline
transmission of the transgene. Methods for constructing
homologous recombination vectors and homologous
recombinant animals are described further in Bradley
(1991) Current Opinion in Bio/Technology 2:823-829 and in
PCT Publication Nos. WO 90/11354, WO 91/01140, WO
92/0968, and WO 93/04169.

Other Embodiments

It is to be understood that while the invention has 
been described in conjunction with the detailed 
description thereof, the foregoing description is 
intended to illustrate and not limit the scope of the 
invention, which is defined by the scope of the appended 
claims. Other aspects, advantages, and modifications are 
within the scope of the following claims. 

What is claimed is:

1. An isolated DNA comprising a nucleotide sequence
which hybridizes under stringent hybridization conditions
to SEQ ID NO:3, or a complement thereof.

2. The isolated DNA of claim 1, wherein the
nucleotide sequence is SEQ ID NO:117.

3. An isolated DNA comprising a nucleotide sequence
selected from the group consisting of SEQ ID NOs:4-12.

4. The isolated DNA of claim 3, comprising the
sequence of SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ
ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ
ID NO:20, or SEQ ID NO:21.

5. An isolated DNA comprising a nucleotide sequence
selected from the group consisting of SEQ ID NOs:22-30.

6. A single stranded oligonucleotide of 14-50
nucleotides in length having a nucleotide sequence
identical to a portion of SEQ ID NO:3, or a complement
thereof.

7. A pair of PCR primers consisting of:

(a) a first single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of SEQ ID NO:117; and

(b) a second single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of SEQ ID NO:117, wherein the
sequence of at least one of the oligonucleotides is
identical to a portion of a strand of SEQ ID NO:3, and
the first oligonucleotide is not complementary to the
second oligonucleotide.

8. A pair of single-stranded oligonucleotides,
wherein both oligonucleotides are selected from the group
consisting of SEQ ID NOs:130-231, SEQ ID NO:110, and SEQ
ID NO:112 and the oligonucleotides are different from
each other.

9. An isolated DNA comprising a nucleotide sequence
that encodes a polypeptide that shares at least 70%
sequence identity with SEQ ID NO:2, or a complement of
the nucleotide sequence.

10. The isolated DNA of claim 9, wherein the
polypeptide comprises the sequence of SEQ ID NO:2.

11. An isolated DNA comprising a nucleotide
sequence which hybridizes under stringent hybridization
conditions to a nucleic acid having a sequence selected
from the group consisting of SEQ ID NOs:31-79 and 90-100.

12. A single stranded oligonucleotide of 14-50
nucleotides in length comprising a nucleotide sequence
which is identical to a portion of a nucleic acid
selected from the group consisting of SEQ ID NOs:31-79
and 90-100, or a complement of the nucleotide sequence.

13. The oligonucleotide of claim 12, wherein the
portion includes an intronic sequence.

14. A pair of PCR primers consisting of:

(a) a first single-stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of a sense strand of a nucleic
acid selected from the group consisting of SEQ ID
NOs:31-85; and

(b) a second single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of the antisense strand of a
nucleic acid selected from the group consisting of SEQ ID
NOs:31-85, wherein the sequence of at least one of the
oligonucleotides comprises a sequence identical to a
portion of a nucleic acid selected from SEQ ID NOs:31-79
and 90-100, and wherein the first oligonucleotide is not
complementary to the second oligonucleotide.

15. A pair of single-stranded oligonucleotides
selected from the group consisting of SEQ ID NOs:101-116,
SEQ ID NOs:184-185, SEQ ID NOs:188-191, SEQ ID
NOs:210-213, and SEQ ID NOs:216-217.

16. A vector comprising the isolated DNA of claim 1.

17. A substantially pure polypeptide comprising an
amino acid sequence sharing at least 70% sequence
identity with SEQ ID NO:2.

18. The substantially pure polypeptide of claim 17,
wherein the polypeptide comprises an amino acid sequence
identical to that of a naturally occurring polypeptide.

19. The substantially pure polypeptide of claim 18,
wherein the amino acid sequence comprises the sequence of
SEQ ID NO:2.

20. A substantially pure polypeptide comprising an
amino acid sequence identical to the amino acid sequence
of amino acid residues 1-500, 501-1000, 1001-1500, or
1501-2080 of SEQ ID NO:2.

21. A substantially pure polypeptide comprising the
amino acid sequence of SEQ ID NO:86, SEQ ID NO:87, SEQ ID
NO:88 or SEQ ID NO:89.

22. A substantially pure polypeptide selected from
the group consisting of amino acids 253-403 of SEQ ID
NO:2, amino acids 624-865 of SEQ ID NO:2, and amino acids
1664-1786 of SEQ ID NO:2.

23. A fusion protein comprising a polypeptide of
claim 22.

24. An antibody that specifically binds to the
polypeptide of claim 22.

25. An antibody that binds specifically to the
polypeptide of claim 17.

26. A cell comprising the isolated DNA of claim 1.

27. A non-human mammal, the genomic DNA of which
bears a transgene, wherein the transgene comprises the
isolated DNA of claim 1.

28. A transgenic non-human mammal having a
transgene disrupting or interfering with the expression
of a dysferlin gene.

29. A method of decreasing the symptoms of muscular
dystrophy in a mammal, the method comprising introducing
into a cell of said mammal the isolated DNA of claim 1.

30. A method of decreasing the symptoms of muscular
dystrophy in a mammal, the method comprising introducing
into a cell of said mammal the vector of claim 16, the
vector being an expression vector.

31. A method of decreasing the symptoms of muscular
dystrophy in a mammal, the method comprising introducing
into a cell of said mammal the protein of claim 17.

32. A method for identifying a patient, a fetus, or 
a pre-embryo at risk for having a dysf erlin-related 
disorder, the method comprising: 

(a) obtaining a sample of genomic DNA from the 
10 patient, fetus, or pre-embryo; and 

(b) determining whether the sample contains a 
mutation in a dysferlin gene, wherein a patient, a fetus, 
or a pre-embryo having a mutation in a dysferlin gene is 
at risk for having a dysf erlin-related disorder. 

15 33. The method of claim 32, comprising: 

(a) treating the sample of genomic DNA with a 
restriction enzyme specific for a particular restriction 
enzyme site; and 

(b) detecting the presence or absence of the 
20 particular restriction enzyme site in the sample of 

genomic DNA as an indication of the presence or absence 
of a particular mutation in the genomic DNA. 

34. The method of claim 33, wherein the restriction 
enzyme is selected from the group consisting of Pst I, 

25 Fnu4H I, BamH I, BstY I, Ava II, HinP I, Fsp I, Mbo II, 
ScrF I, BstN I, Mae I, Bfa I, Dde I, Bpm I, Ban II, Ava 
II, and Sau96 I. 

35. The method of claim 32, comprising subjecting 
the sample to polymerase chain reaction (PCR) . 
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36. The method of claim 32, comprising: 

(a) contacting a single stranded oligonucleotide 
with the sample of genomic DNA; and 

(c) detecting hybridization or lack thereof between 
5 the single stranded oligonucleotide and the genomic DNA, 
as an indication of the presence or absence of a mutation 
in the genomic DNA. 

37. A method for identifying a patient, a fetus, or 
a pre-embryo at risk for having a dysf erlin-related 
disorder, said method comprising: 

(a) providing a sample comprising dysferlin mRNA 
from the patient, fetus, or pre-embryo; and 

(b) determining whether the dysferlin mRNA contains 
a mutation, wherein a patient, a fetus, or a pre-embryo 
having a dysferlin mRNA containing a mutation is at risk 
for having a dysf erlin-related disorder. 

38. The method of claim 37, wherein the presence or 
absence of the mutation is detected by Northern blot. 

39. The method of claim 37, wherein the method
includes the step of subjecting the sample to polymerase
chain reaction (PCR).

40. A method for detecting the absence of a 
mutation in a dysferlin protein of a patient, a fetus, or 
a pre-embryo, the method comprising: 

(a) providing a sample comprising a dysferlin 
protein of the patient, fetus, or pre-embryo; 

(b) contacting the sample with the antibody of 
claim 22; and 

(c) detecting binding of the antibody to dysferlin 
protein in the sample, if any, wherein binding indicates 
a normal dysferlin protein. 



41. An isolated DNA comprising a nucleotide sequence
that is identical to the sequence of amino acid residues
3501-3520 of SEQ ID NO:1, 3737-3756 of SEQ ID NO:1, 3842-
3861 of SEQ ID NO:1, 5114-5139 of SEQ ID NO:1, or 5239-
5255 of SEQ ID NO:1.

42. An isolated DNA comprising a nucleotide
sequence selected from the group consisting of

3501-3520 of SEQ ID NO:1, wherein nucleotide G at
3510 is A;

3737-3756 of SEQ ID NO:1, wherein nucleotide G at
3746 is deleted;

3842-3861 of SEQ ID NO:1, wherein nucleotide C at
3851 is T;

5114-5139 of SEQ ID NO:1, wherein nucleotide C at
5122 and nucleotide A at 5123 are deleted;

5239-5255 of SEQ ID NO:1, wherein nucleotide G at
5245 is deleted and nucleotide G at 5249 is C; and

5239-5255 of SEQ ID NO:1, wherein nucleotide G at
5245 is C and nucleotide G at 5249 is deleted.
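
Claim 42 describes each variant as a window of SEQ ID NO:1 with substitutions or deletions given in full-sequence coordinates. A minimal Python sketch of that bookkeeping follows. The reference string is a short hypothetical sequence, not SEQ ID NO:1, and the function name and edit format are assumptions made for illustration. Edits are applied to per-base slots so that a deletion does not shift the coordinates of a later edit.

```python
def mutated_fragment(seq, start, end, edits=()):
    """Return the 1-based inclusive window start..end of seq after applying edits.

    Each edit is (position, ref_base, alt_base) in full-sequence coordinates;
    an empty alt_base means the base is deleted.
    """
    bases = list(seq)
    for position, ref, alt in edits:
        if not start <= position <= end:
            raise ValueError(f"position {position} lies outside {start}-{end}")
        found = bases[position - 1].upper()
        if found != ref.upper():
            raise ValueError(f"expected {ref} at {position}, found {found}")
        bases[position - 1] = alt  # "" removes the base when the window is joined
    return "".join(bases[start - 1:end])


# Hypothetical 20-nucleotide reference (NOT SEQ ID NO:1).
reference = "ACGTACGTACACGTACGTAC"

window = mutated_fragment(reference, 5, 14)
substituted = mutated_fragment(reference, 5, 14, [(7, "G", "A")])
deleted = mutated_fragment(reference, 5, 14, [(7, "G", "")])
combined = mutated_fragment(reference, 5, 14, [(7, "G", ""), (13, "G", "C")])
print(window, substituted, deleted, combined)
```

The reference-base check mirrors the claim language ("nucleotide G at 3510 is A"): an edit is rejected if the stated wild-type base is not actually found at the stated position.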

43. An isolated nucleic acid comprising a
nucleotide sequence which hybridizes under stringent
hybridization conditions to nucleic acids 3284-3720 of
SEQ ID NO: 232, or the complement of said nucleotide
sequence.

44. An isolated nucleic acid comprising a
nucleotide sequence identical to the sequence of
nucleotides 3284-3720 of SEQ ID NO: 232, or a complement
of said nucleotide sequence.

45. The isolated nucleic acid of claim 44, wherein
the nucleotide sequence comprises the sequence of SEQ ID
NO: 232 or the complement of SEQ ID NO: 232.

46. An isolated polypeptide comprising:

a) at least 15 contiguous amino acids of the
polypeptide comprising amino acids 1-24 of SEQ ID NO: 233,

b) a naturally occurring allelic variant of a
polypeptide comprising amino acids 1-24 of SEQ ID NO: 233,
or

c) an amino acid sequence which is encoded by a
nucleic acid molecule which hybridizes under stringent
conditions to nucleotides 3284-3720 of SEQ ID NO:232.

47. The polypeptide of claim 46, wherein the
polypeptide comprises SEQ ID NO:233.

48. A vector comprising the nucleic acid of claim 44.

49. A cell comprising the vector of claim 48. 

50. A method of making a polypeptide, the method
comprising culturing the cell of claim 49.

51. An antibody which specifically binds to a 
polypeptide of claim 46. 

52. The antibody of claim 51, wherein the antibody
binds to a polypeptide selected from the group comprising
amino acids 253-403 of SEQ ID NO: 233, amino acids 624-865
of SEQ ID NO: 233, and amino acids 1664-1786 of SEQ ID
NO: 233.

53. The antibody of claim 51, wherein the antibody
is a monoclonal antibody.

54. The antibody of claim 51, wherein the antibody 
is a polyclonal antibody. 

1 MLRVFILYAE NVHTPDTDIS DAYCSAVFAG VKKRTKVIKN SVNPVWNEGF
51 EWDLKGIPLD QGSELHVVVK DHETMGRNRF LGEAKVPLRE VLATPSLSAS
101 FNAPLLDTKK QPTGASLVLQ VSYTPLPGAV PLFPPPTPLE PSPTLPDLDV 
151 VACTGGEEDT EDQGLTGDEA EPFLDQSGGP GAPTTPRKLP SRPPPhSgi 
201 E5BRSAPTSR KLLSDKPQDF QIRVQVIEGR QLPGVNIKPV VKVTAAGOTK 
251 RTRIHKGNSP LFNETLFFNL FDSPGELFDE PIFITWDSR SLRTDALLGE 
301 FRMDVGTIYR EPRHAYLRKW LLLSDPDDFS AGARGYLKTS LCVLciSS 
3 51 PLERKDPSED KEDIESNLLR PTGVALRGAH FCLKVFRAED LPWDDAvS 
401 NVKQIFGFES NKKNLVDPFV EVSFAGKMLC SKTT.PKtimp qBSSSS? 
451 KFPSMCEKMR IRIIDWPRr.T mDIVATTYL SMSKISAPGG EIEEEpIgI? 
SOI KPSKASDLDD YLGFLPTFGP CYINLYGSPR EFTGFPDPYT EuSSg'eg'v 
551 AYRGRIXLSL ETKLVEHSEQ KVEDLPADDI LRVEKYLfifiB. EYSLFAAFYS 
601 ATMLQDVDDA IQFEVSIGNY GNKFDMTCLP LASTTQYSRA VFDGCHYYYL 
651 PWGNVKPVW LSSYWEDISH RIETQNQLLG IADRLEAGLE QVHLALKAOC 
701 STEDVDSLVA QLTDELIAGC SQPLGDIHET PSATHLDQYL YQLRThSsQ 
751 ITEAALALKL GHSELPAALE QAEDWLLRLR ALAEEPQNSL PMVIWMLQG 
801 DKRVAYQRVP AHQVLFSRRG ANYCGKNCGK LQTIFLKYPM EKVPGA^PV 
851 QIRVKLWFGL SVDEKEFNQF AEGKLSVFAE TYENETKLAL VGNWGTTGLT 
901 YPKFSDVTGK IKLPKDSFRP SAGWTWAGDW FVCPEKTLLH DMDAGHLSFV 
9S1 EEVFENQTRL PGGQWIYMSD MYTDVNGEKV LPKDDIECPL OOMEDEEMS 
1001 TDLNRAVDEQ GWEY5ITI££ E55EEHWVPA EKKYY^S B^SSSl 
1051 SQMZALESm QAEAEGEGWE YASLFGWKFH LEYRKTDAFJJ eSSSeP 
1101 LEKTG PAAVF ALEGALGGVM DDKS ED SMS V STLSFGVNR? StciFDYGN 

1201 OTLIFYEIET FCEPftTYftF.Q PPSTWFT.vn mrYt;knP«i fiRCICQPSLE 
1251 RMPRLAWFPL TRGSQPSGEL LASFEL1QRE KPAIHHIPGF EVQETSR1LD 
}IH ^™ L?YP PP0REANIYM VPONIKPALQ RTAIEILAWG LRNMKSYQLA 
1351 NISSPSLWE CGGQTVQSCV IRNLRKNPNF D1CTLFMEVM LPREELYCPP 
1401 ITVKVIDNRQ FGRRPWGQC TIRSLESFLC DPYSAESPSP QGGPDDVSLL 
1451 SPGEDVLIDI DDKEPLIPIQ EEEFIDWWSK FFASIGEREK CGSYLEKDFD 
1501 TLKVYDTQLE NVEAFEGLSD FCNTFKLYRG KTQEETEDPS VIGEFKGLFK 
1551 IYPLPEDPAI PMPPRQFHQL AAQGPQECLV RIYJTVRAFGL DEKDPNGKCD 
1601 PYIKISTfiKK SVSPODNYTP ctl^ p ggg.gt ggggjS 
1651 DLLSKPEKIfi ETWDLENRL LSKFGARCGL PQTYCVSGPN QWRDQLRPSO 
1701 LLHLFCQQHR VKAPVYRTDR VMFQDKEYSI EEIEAGRIPN PHLGPVEERL 
tin, ^™° QQGL VPEHVESRP ^ YSPLOPDIEQ GKLQMWVDLF PKALGRPGPP 
1801 FNITPSSARR EFLRCIIWMT RDVILDDLSL TGEKMSDIYV KGWMIGFEEH 
1851 KQKTDVHYRS LGGEGNFNWR FIFPFDYLPA EQVCTIAKKD AFWRLDKTES 
1901 KIPARWFQI WDNDKFSFDD FLGSLQLDLN RMPKPAKTAK KCSLDQLDDA 
1951 FHPEWFVSLF EOKTVKGWWP CVAEEGEKKI LAGKLEMTLE IVAESEHEER 
2001 PAGQGRDEPN MNPKLEDPRR PDTSFLWFTS PYKTMKFILW RRFRWAIILF 
2051 XXZ.FILIAFL AIF1YAFPNY AAMKIffiKEBS (S£ Q te *t> . ^ 

1/1 31/11 61/21

TTT TOG TTC AAG CGA TIC TCT GGC CTC AGC CTC CCG AGT AGC TGG GAT TAC AGG CAT GCT CCA OCA AGC CCG GGT AAT TTT GTA TTT TTA 
si p KRFSGLSLPSSWDYRHAPPSPGNFVFL 
91/31 121/41 151/51 

ATA GAG ACG GGG TTT TGC CAT GTT GGT CAG GCT GGT CTC GAA CTC CTG AOC TCA GGT GAT CTG CCC ACC TTG GCC TCC CAA OCT GCT GAG 
j ETGFC HVGQAGLELLTSGDLPTLASQRAE 
181/61 211/71 241/81 

vrr aoa OCT ATG ACT CAC TGT GCC COG CAG AGA TGG TCT AAT TCA TAT GAA AGA ACT CTG AAA AAA GTA GAA AGT GAT TTT CTA AAA TAA 
ITGM SHCARQRWSNSYERTLKKVESDFLK* 
271/91 301/101 331/111

«3T ACA AAT AAT TAA TGT AAG CAT AAT CAC CTA ACC TTG TGG AAT TTT TTT TTT TTG AGA AGC AAA TTG CAA ATT TGT GAT AGA TCT AAA 
G TNN • CKHNHLTLWNFFFLRSKLQ ICDRSK 
361/121 391/131 421/141 

GGA GAT TGA CTA AGA GGG TGA CCA TCT GGA AAT GAC GTC ATG TGA GAA TGG TTA AAG ATG CTC GGG AGA TTG AGC CTA GAG AAA GGA AGA 
G D * L R G * PSGNDVM* EWLKMLGRLSLEKGR 
451/151 481/161 511/171

TTT GTG AAC CCA GGA GGC AGA GGT AGA GAT CCA GGA GAG ggc ggc gtg atg gat gac aag agt gaa gat tec atg tec gtc tec ace ttg 
FVNPGGRGRDPGEGGVMDD KSEDSMSVSTL 
541/181 571/191 601/201 

ttc cat ata aac aga ccc acg att tec tgc ata ttc gac tat ggg aac cgc tac cat eta cgc tgc tac atg tac cag gee egg gac 
SF GV NRPTISCIFDYGNRYHLRCYMYQARD 
631/211 661/221 691/231

eta act aca ata aac aag gac tct ttt tct gat ccc tat gec ate gtc tec ttc ctg cac cag age cag aag acg gtg gtg gtg aag aac 
L A AM D K D S F S O P Y A I V S P L H Q S Q K T V V V X N 
721/241 751/251 781/261 

acc ctt aac ccc acc tgg gac cag acg etc ate ttc tac gag ate gag ate ttt ggc gag ccg gee aca gtt get gag caa ccg ccc age 
T LNPTWDQTLIFYEIEIFGEPATVAEQPPS 
811/271 841/281 871/291



att ata ata aaa ctg tac gac cat gac act tat ggt gca gac gag ttt atg ggt cgc tgc ate tgt caa ccg agt ctg gaa egg atg cca 
I V V E L YDHDTYGADEFMGRCICQPSLERMP 
901/301 931/311 961/321 

caa eta acc tag ttc cca ctg acg agg ggc age cag ccg teg ggg gag ctg ctg gee tct ttt gag etc ate cag aga gag aag ccg gee 
R L AW FPLTRGSQPSGELLASFELIQREKPA 
991/331 1021/341 1051/351

ate cac cat att cct ggt ttt gag gtg cag gag aca tea agg ate ctg gat gag tct gag gac aca gac ctg ccc tac cca cca ccc cag 
jHHIPGFEVQETSR.ILDESEDTDLPYPPPQ 
1081/361 1111/371 1141/381 

aoa aaa see aac ate tac atg gtt cct cag aac ate aag cca gcg etc cag cgt acc gee ate gag ate ctg gca tgg ggc ctg egg aac 
REAM I YMV PQNI K PA LQRTAI E I LAWG L R N 
1171/391 1201/401 1231/411 

«ta aaa aat tac caa eta acc aac ate tec tec ccc age etc gtg gta gag tgt ggg ggc cag acg gtg cag tec tgt gtc ate agg aac 
5 KS Y Q I A H I 8 S P S L V V B C G C Q T V Q S C V I R N 
1261/421 1291/431 1321/441 

etc cog aag aac ccc aac ttt gac ate tgc acc etc ttc atg gaa gtg atg ctg ccc agg gag gag etc tac tgc ccc ccc ate acc gtc 
LRKMPNFDICTLFMEVMLPREELYCPPITV 
1351/451 1381/461 1411/471 

aaa ate ate gat aac cgc cag ttt ggc cgc egg cct gtg gtg ggc cag tgt acc ate cgc tec ctg gag age ttc ctg tgt gac ccc tac 
KVIDNRQFGRRPVVGQCTIRSLESFLCDPY 

1441/481 1471/491 1501/501 

tea aca aaa aat cca tec cca cag ggt ggc cca gac gat gtg age eta etc agt cct ggg gaa gac gtg etc ate gac att gat gac aag 
S A E S PSPQGGPDDVSLLSPGEDVLIDIDDK 

1531/511 1561/521 1591/531 

aaa ccc etc ate ccc ate cag gag gaa gag ttc ate gat tgg tgg age aaa ttc ttt gee tec ata ggg gag agg gaa aag tgc ggc tec 
E PLI PIQEEEFIDWWSKFFASIGEREKCGS 
1621/541 1651/551 1681/561 

tac cto aaa aaa aat ttt gac acc ctg aag gtc tat gac aca cag ctg gag aat gtg gag gee ttt gag ggc ctg tct gac ttt tgt aac 
Y L E KDFDTLKVYOTQ LENVEAFEG LSDFCN 
1711/571 1741/581 1771/591 

acc ttc aaa eta tac egg ggc aag acg cag gag gag aca gaa gat cca tct gtg att ggt gaa ttt aag ggc etc ttc aaa att tat ccc 
T FK L YR GKTQBETEDPSVIGEFKGLFKIYP 

1801/601 1831/611 1861/621 

etc cca aaa aac cca gee ate ccc atg ccc cca aga cag ttc cac cag ctg gee gee cag gga ccc cag gag tgc ctg gtc cgt ate tac 

l p To p a i p m p p r o f k o l a a o g p q e c l v r i y 

1891/631 1921/641 1951/651 

att gtc cga gca ttt ggc ctg cag ccc aag gac ccc aat gga aag tgt gat cct tac ate aag ate tec ata ggg aag aaa tea gtg agt 
IVRAFGLQPKDPNGKCDPYIKIS IGKKSVS 
1981/661 2011/671 2041/681 

gac cag gat aac tac ate ccc tgc acg ctg gag ccc gta ttt gga aag atg ttc gag ctg acc tgc act ctg cct ctg gag aag gac eta 
DQDNYIPCTLEPVFGKMFELTCTLPLEKDL 
2071/691 2101/701 2131/711 

aag ate act etc tat gac tat gac etc etc tec aag gac gaa aag ate ggt gag acg gtc gtc gac ctg gag aac agg ctg ctg tec aag 
K ITLYDYDLLSKDEKIGETVVDLENRLLSK 

2161/721 2191/731 2221/741 

ttt ggg get cgc tgt gga etc cca cag acc tac tgt gtc tct gga ccg aac cag tgg egg gac cag etc cgc ccc tec cag etc etc cac 
FG ARCGLPQTYCVSGPNQWRDQLRPSOLLH 
2251/751 2281/761 2311/771 

etc ttc tgc cag cag cat aga gtc aag gca cct gtg tac egg aca gac cgt gta atg ttt cag gat aaa gaa tat tec att gaa gag ata 
LFCQQHRVKAPVYRTDRVMFQDKEYSI EEI 
2341/781 2371/791 2401/801 

aag get ggc agg ate cca aac cca cac ctg ggc cca gtg gag gag cgt ctg get ctg cat gtg ctt cag cag cag ggc ctg gtc ccg gag 
EAGRIPNPHLGPVEERLALHVLQQQGLVPE 

Figure 6A 

22 l £ X « «- C 9 ccc c t c ccc P^cc. ate p. cag ggg «, p cc. ttt ccg r

" V o„<, E S 2551/851 2581/861 

""Zcgg ccc gga cct ccc tec aac ace ace cc. egg .g. gec .g. .gg etc ttc ctg eg. tgt ate ate tgg aat acc ag. gat gtg 
LGRPGP 2641/881 2671/891 

atc^t J- ctg age etc acg ggg gag aag -tg age g.e att tat gtg -a. ggt tgg atg att ggc ttt gaa ga. c.c -ag ca. ..g 

1 L « rt , D 2731/911 2761/921 

cat .t cgt tec ctg gg. M t ga. ggc ..c tec aac tgg -gg ccc att etc ccc ccc gac tac ctg cca get gag c» *e 

T 2821/941 2851/951 

Cgt^-tt gec aag aag gat gee ttc tgg agg ctg gac aag act gag age -aa ate cca go. eg. gtg gtg ttc cag ate tgg gac aat 
C T a <, X 2911/971 2941/981 

gac^eec tec ttt gat gat ttt ctg ggc tec ctg cag etc gat etc aac cgc atg ccc aag cc. gec aag ac. gee jag aag Cgc tec 

° K oo/ S 3001/1001 3031/1011 

ctg gat gat get ttc cac cc. ga. tgg ttt gtg tec etc ttt g.g cag aa. -ca gtg aag ggc tgg tgg ccc cgc gc. gca 

L ",„■>? L 3091/1031 3121/1041 

g-gag^t gag aag -aa ata ctg gcg ggc aag ctg fl a- atg -cc ttg gag att gta gca gag age g^g eat gag gag egg cct get ggc 

3151/1051 3181/1061 3211/1071

cag ggc cg gat gag ccc aac atg aac cct aag ctt gag gac cca agg cgc ccc gac acc tcc ttc ctg
Q G R D E P N M N P K L E D P R R P D T S F L
3241/1081 3271/1091 3301/1101

acc atg aa ttc atc ctg tgg cgg cgt ttc cgg tgg gcc atc atc ctc ttc atc atc ctc ttc atc ctg
T M K F I L W R R F R W A I I L F I I L F I L
3331/1111 3361/1121 3391/1131

atc tac gc ttc ccg aac tat gct gcc atg aag ctg gtg aag ccc ttc agc tga gga ctc tcc tgc cct
I Y A F P N Y A A M K L V K P F S * G L S C P
3421/1141 3451/1151 3481/1161

cct cca gc tgg gac tgg cct gcc tcc tcc gcc cag ctc ggc gag ctc ctc cag acc tcc tag gcc tga
P P A W D W P A S S A Q L G E L L Q T S * A *
3511/1171 3541/1181 3571/1191

... ... ... tgg acc ggc cca cac tcc cag agt tgc taa cat gga gct ctg aga tca ccc cac ttc cat
T D R W T G P H S Q S C * H G A L R S P H F H
3601/1201 3631/1211 3661/1221

... ... ... ttg gat cag ctc aga cat att tca gta taa aac agt tgg aac cac aaa aaa aaa aaa aaa
... L D Q L R H I S V * N S W N H K K K K K

(SEQ ID NO: 233)



Figure 6B 






WO 00/11157 



1/68 



PCT/US99/19395 



SEQUENCE LISTING 
<110> The General Hospital Corporation 

<120> DYSFERLIN, A GENE MUTATED IN DISTAL MYOPATHY AND LIMB 
GIRDLE MUSCULAR DYSTROPHY 



<130> 00786/399WO2 

<150> US 60/097,927 
<151> 1998-08-25 



<160> 233 



<170> FastSEQ for Windows Version 3.0 

<210> 1 

<211> 6911 

<212> DNA 

<213> Homo sapiens 



<220> 
<221> CDS 

<222> (374)...(6613)



<400> 1 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 
acacgcgcca agc atg ctg agg gtc ttc atc ctc tat gcc gag aac gtc 409
Met Leu Arg Val Phe Ile Leu Tyr Ala Glu Asn Val
1 5 10

cac aca ccc gac acc gac atc agc gat gcc tac tgc tcc gcg gtg ttt 457
His Thr Pro Asp Thr Asp Ile Ser Asp Ala Tyr Cys Ser Ala Val Phe
15 20 25

gca ggg gtg aag aag aga acc aaa gtc atc aag aac agc gtg aac cct 505
Ala Gly Val Lys Lys Arg Thr Lys Val Ile Lys Asn Ser Val Asn Pro
30 35 40

gta tgg aat gag gga ttt gaa tgg gac ctc aag ggc atc ccc ctg gac 553
Val Trp Asn Glu Gly Phe Glu Trp Asp Leu Lys Gly Ile Pro Leu Asp
45 50 55 60

cag ggc tct gag ctt cat gtg gtg gtc aaa gac cat gag acg atg ggg 601
Gln Gly Ser Glu Leu His Val Val Val Lys Asp His Glu Thr Met Gly
65 70 75

agg aac agg ttc ctg ggg gaa gcc aag gtc cca ctc cga gag gtc ctc 649
Arg Asn Arg Phe Leu Gly Glu Ala Lys Val Pro Leu Arg Glu Val Leu
80 85 90

gcc acc cct agt ctg tcc gcc agc ttc aat gcc ccc ctg ctg gac acc 697
Ala Thr Pro Ser Leu Ser Ala Ser Phe Asn Ala Pro Leu Leu Asp Thr
95 100 105

aag aag cag ccc aca ggg gcc tcg ctg gtc ctg cag gtg tcc tac aca 745
Lys Lys Gln Pro Thr Gly Ala Ser Leu Val Leu Gln Val Ser Tyr Thr
110 115 120
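
The relationship between the CDS feature of SEQ ID NO:1, the codon rows, and the running nucleotide positions at the end of each row can be checked with a short Python sketch. The CDS bounds (374 to 6613) and the first twelve codons are taken from the listing, reading them as atg ctg agg gtc ttc atc ctc tat gcc gag aac gtc; the dictionary is only the subset of the standard genetic code needed for those codons, and all variable names are illustrative.

```python
# CDS bounds from the <222> feature line of SEQ ID NO:1 (1-based, inclusive).
CDS_START, CDS_END = 374, 6613

# Subset of the standard genetic code covering the first row of codons only.
STANDARD_CODE_SUBSET = {
    "atg": "Met", "ctg": "Leu", "agg": "Arg", "gtc": "Val",
    "ttc": "Phe", "atc": "Ile", "ctc": "Leu", "tat": "Tyr",
    "gcc": "Ala", "gag": "Glu", "aac": "Asn",
}

cds_length = CDS_END - CDS_START + 1   # nucleotides in the annotated CDS
codon_count, remainder = divmod(cds_length, 3)

# First row of codons as transcribed from the listing (an assumption).
first_row = "atg ctg agg gtc ttc atc ctc tat gcc gag aac gtc".split()
residues = [STANDARD_CODE_SUBSET[codon] for codon in first_row]

# Position of the last nucleotide in the first row of the listing.
first_row_end = CDS_START + 3 * len(first_row) - 1

print(cds_length, codon_count, remainder)
print(" ".join(residues))
print(first_row_end)
```

Each later full row holds 16 codons, so its end position is 48 nucleotides beyond the previous one (457, 505, 553, and so on), which is a convenient way to reassign a displaced position number to its row.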

cca ctg cct gga gct gtg ccc ctg ttc ccg ccc cct act cct ctg gag
Pro Leu Pro Gly Ala Val Pro Leu Phe Pro Pro Pro Thr Pro Leu Glu
125 130 135 140

ccc tec ccg act ctg cct gac ctg gat gta gtg gca gac aca gga gga 
Pro III Pro Thr Leu Pro Asp Leu Asp Val Val Ala Asp Thr Gly Gly 
145 150 1 



aaa aaa gac aca gag gac cag gga etc act gga gat gag gcg gag cca 
llu llu Asp Thr Glu Asp Gin Gly Leu Thr Gly Asp Glu Ala Glu Pro 
160 165 170 

^4-^ r-+rr nat caa aoc aaa qgc ccg ggg get ccc acc acc cca agg aaa 
III Leu lap tin III lly G?y PrJ Gly Ala Pro Thr Thr Pro Arg Lys 



175 



eta cct tea cat cct ccg ccc cac tac ccc ggg ate aaa aga aag cga 
Itu Pro ser Arg Pro Pro Pro His Tyr Pro Gly lie Lys Arg Lys Arg 
190 195 200 

agt gcg cct aca tct aga aag ctg ctg tea gac aaa ccg cag gat ttc 
llr All Pro Thr Ser Arg Lys Leu Leu Ser Asp Lys Pro Gin Asp Phe 
205 210 215 220 

cag ate agg gtc cag gtg ate gag ggg cgc cag ctg ccg ggg gtg aac 
til tie Arf Val Gin Val He Glu Gly Arg Gin Leu Pro Gly Val Asn 
225 230 235 

ate aaa cct ata gtc aag gtt acc get gca ggg cag acc aag egg acg 
Til Lys III Val Val Lyl Val Thr Ala Ala Gly Gin Thr Lys Arg Thr 
240 245 250 

caa ate cac aag gga aac age cca etc ttc aat gag act ctt ttc ttc 
Ar! fie His Lyl Gly Asn Ser Pro Leu Phe Asn Glu Thr Leu Phe Phe 

260 265 



255 



aac ttg ttt gac tct cct ggg gag ctg ttt gat gag ccc ate ttt ate 
Asn Leu Phe Asp Ser Pro Gly Glu Leu Phe Asp Glu Pro He Phe He 
270 275 280 

acg gtg gta gac tct cgt tct etc agg aca gat get etc etc ggg gag 
Thf val Val Asp Ser Arg Ser Leu Arg Thr Asp Ala Leu Leu Gly Glu 
285 * 290 295 300 

ttc egg atg gac gtg ggc acc att tac aga gag ccc egg cac gee tat 
III Arg Met Asp Val Gly Thr He Tyr Arg Glu Pro Arg His Ala Tyr 
310 3x5 



305 



320 ~ 325 330 

gee aga ggc tac ctg aaa aca age ctt tgt gtg ctg ggg cct ggg gac 
Ala Arg Gly Tyr Leu Lys Thr Ser Leu Cys Val Leu Gly Pro Gly Asp 
335 340 345 

gaa gcg cct ctg gag aga aaa gac ccc tct gaa gac aag gag gac att 
Glu Ala Pro Leu Glu Arg Lys Asp Pro Ser Glu Asp Lys Glu Asp He 
350 355 360 



gaa age aac ctg etc egg ccc aca ggc gta gee ctg cga gga gec cac 
Glu sir Asn Leu Leu Arg Pro Thr Gly Val Ala Leu Arg Gly Ala His 
365 370 375 380 

ttc tgc ctg aag gtc ttc egg gec gag gac ttg ccg cag atg gac gat 
Phe Cys Leu Lys Val Phe Arg Ala Glu Asp Leu Pro Gin Met Asp Asp 



385 



793 



841 



889 



937 



985 



1033 



1081 



1129 



1177 



1225 



1273 



1321 



etc agg aag tgg ctg ctg etc tea gac cct gat gac ttc tct get ggg 1369 
Leu Arg Lys Trp Leu Leu Leu Ser Asp Pro Asp Asp Phe Ser Ala Gly 



1417 



1465 



1513 



1561 

gcc gtg atg gac aac gtg aaa cag atc ttt ggc ttc gag agt aac aag 1609
Ala Val Met Asp Asn Val Lys Gln Ile Phe Gly Phe Glu Ser Asn Lys
400 405 410 

aag aac ttg gtg gac ccc ttt gtg gag gtc age ttt gcg ggg aaa atg 1657 
Lvs ABn Leu Val Asp Pro Phe Val Glu Val Ser Phe Ala Gly Lys Met 
1 415 420 425 

ctg tgc age aag ate ttg gag aag acg gcc aac cct cag tgg aac cag 1705 
Leu Cys Ser Lys lie Leu Glu Lys Thr Ala Asn Pro Gin Trp Asn Gin 
430 435 440 

aac ate aca ctg cct gcc atg ttt ccc tec atg tgc gaa aaa atg agg 1753 
Asn lie Thr Leu Pro Ala Met Phe Pro Ser Met Cys Glu Lys Met Arg 
445 450 455 460 

att cgt ate ata gac tgg gac cgc ctg act cac aat gac ate gtg get 1801 
lie Ara lie lie Asp Trp Asp Arg Leu Thr His Asn Asp lie Val Ala 
465 470 47 5 

ace ace tac ctg agt atg teg aaa ate tct gcc cct gga gga gaa ata 1849 
Thr Thr Tyr Leu Ser Met Ser Lys He Ser Ala Pro Gly Gly Glu He 
480 485 490 

gaa gag gag cct gca ggt get gtc aag cct teg aaa gcc tea gac ttg 1897 
Glu Glu Glu Pro Ala Gly Ala Val Lys Pro Ser Lys Ala Ser Asp Leu 
495 " 500 505 

gat gac tac ctg ggc ttc etc ccc act ttt ggg ccc tgc tac ate aac 1945 
Asp Asp Tyr Leu Gly Phe Leu Pro Thr Phe Gly Pro Cys Tyr He Asn 
510 515 520 

etc tat ggc agt ccc aga gag ttc aca ggc ttc cca gac ccc tac aca 1993 
Leu Tyr Gly Ser Pro Arg Glu Phe Thr Gly Phe Pro Asp Pro Tyr Thr 
525 530 535 540 

gag etc aac aca ggc aag ggg gaa ggt gtg get tat cgt ggc egg ctt 2041 
Glu Leu Asn Thr Gly Lys Gly Glu Gly Val Ala Tyr Arg Gly Arg Leu 
545 550 555 

ctg etc tec ctg gag ace aag ctg gtg gag cac agt gaa cag aag gtg 
Leu Leu Ser Leu Glu Thr Lys Leu Val Glu His Ser Glu Gin Lys Val 
560 565 570 



2089 



2185 



gag gac ctt cct gcg gat gac ate etc egg gtg gag aag tac ctt agg 2137 
Glu Asp Leu Pro Ala Asp Asp He Leu Arg Val Glu Lys Tyr Leu Arg 
575 580 585 

agg cgc aag tac tec ctg ttt gcg gcc ttc tac tea gcc ace atg ctg 
Arg Arg Lys Tyr Ser Leu Phe Ala Ala Phe Tyr Ser Ala Thr Met Leu 
590 595 600 

cag gat gtg gat gat gcc ate cag ttt gag gtc age ate ggg aac tac 2233 
Gin Asp Val Asp Asp Ala He Gin Phe Glu Val Ser He Gly Asn Tyr 
605 610 615 620 

ggg aac aag ttc gac atg ace tgc ctg ccg ctg gcc tec ace act cag 
Glv Asn Lys Phe Asp Met Thr Cys Leu Pro Leu Ala Ser Thr Thr Gin 
* 62 5 630 635 



2281 



tac age cgt gca gtc ttt gac ggg tgc cac tac tac tac eta ccc tgg 2329 
Tyr Ser Arg Ala Val Phe Asp Gly Cys His Tyr Tyr Tyr Leu Pro Trp 
640 645 650 

ggt aac gtg aaa cct gtg gtg gtg ctg tca tcc tac tgg gag gac atc
Gly Asn Val Lys Pro Val Val Val Leu Ser Ser Tyr Trp Glu Asp Ile

655 660 665 

age cat aga ate gag act cag aac cag ctg ctt ggg att get gac egg 

Ser His Arg He Glu Thr Gin Asn Gin Leu Leu Gly He Ala Asp Arg 

670 675 680 

ctg gaa get ggc ctg gag cag gtc cac ctg gee ctg aag gcg cag tgc 

Leu Glu Ala Gly Leu Glu Gin Val His Leu Ala Leu Lys Ala Gin Cys 



685 



tec acg gag gac gtg gac teg ctg gtg get cag ctg acg gat gag etc 
Ser Thr Glu Asp Val Asp Ser Leu Val Ala Gin Leu Thr Asp Glu Leu 
710 715 



705 



ate gca ggc tgc age cag cct ctg ggt gac ate cat gag aca ccc tct 
lie Ala Gly Cys ser Gin Pro Leu Gly Asp He His Glu Thr Pro Ser 
720 725 730 

gee ace cac ctg gac cag tac ctg tac cag ctg cgc ace cat cac ctg 
Ala Thr His Leu Asp Gin Tyr Leu Tyr Gin Leu Arg Thr His His Leu 



735 



740 745 



age caa ate act gag get gee ctg gee ctg aag etc ggc cac agt gag 
Ser Gin He Thr Glu Ala Ala Leu Ala Leu Lys Leu Gly His Ser Glu 
750 755 760 

etc cct gca get ctg gag cag gcg gag gac tgg etc ctg cgt ctg cgt 
Leu Pro Ala Ala Leu Glu Gin Ala Glu Asp Trp Leu Leu Arg Leu Arg 
765 770 775 780 

qcc ctg gca gag gag ccc cag aac age ctg ccg gac ate gtc ate tgg 
Ala Leu Ala Glu Glu Pro Gin Asn Ser Leu Pro Asp He Val He Trp 
790 795 



785 



atg ctg cag gga gac aag cgt gtg gca tac cag egg gtg ccc gec cac 
Met Leu Gin Gly Asp Lys Arg Val Ala Tyr Gin Arg Val Pro Ala Hxs 
800 805 810 

caa gtc etc ttc tec egg egg ggt gee aac tac tgt ggc aag aat tgt 
Gin Val Leu Phe Ser Arg Arg Gly Ala Asn Tyr Cys Gly Lys Asn Cys 
815 820 825 

ggg aag eta cag aca ate ttt ctg aaa tat ccg atg gag aag gtg cct 
Gly Lys Leu Gin Thr He Phe Leu Lys Tyr Pro Met Glu Lys Val Pro 
830 835 840 

ggc gee egg atg cca gtg cag ata egg gtc aag ctg tgg ttt ggg etc 
Gly Ala Arc Met Pro Val Gin He Arg Val Lys Leu Trp Phe Gly Leu 
845 850 855 860 

tct gtg gat gag aag gag ttc aac cag ttt get gag ggg aag ctg tct 
Ser Val Asp Glu Lys Glu Phe Asn Gin Phe Ala Glu Gly Lys Leu Ser 
865 870 875 

gtc ttt get gaa acc tat gag aac gag act aag ttg gee ctt gtt ggg 
Val Phe Ala Glu Thr Tyr Glu Asn Glu Thr Lys Leu Ala Leu Val Gly 
880 885 890 

aac tgg ggc aca acg ggc etc acc tac ccc aag ttt tct gac gtc acg 
Asn Trp Gly Thr Thr Gly Leu Thr Tyr Pro Lys Phe Ser Asp Val Thr 
895 900 905 

ggc aag ate aag eta ccc aag gac age ttc cgc ccc teg gee ggc tgg 
Gly Lys He Lys Leu Pro Lys Asp Ser Phe Arg Pro Ser Ala Gly Trp 
910 915 920 



2377 



2425 



2473 



2521 



2569 



2617 



2665 



2713 



2761 



2809 



2857 



2905 



2953 



3001 



3049 



3097 



3145 

acc tgg gct gga gat tgg ttc gtg tgt ccg gag aag act ctg ctc cat 3193
Thr Trp Ala Gly Asp Trp Phe Val Cys Pro Glu Lys Thr Leu Leu His

925 930 935 940 

gac atg gac gec ggt cac ctg age ttc gtg gaa gag gtg ttt gag aac 3241 

Aso Met Asp Ala Gly His Leu Ser Phe Val Glu Glu Val Phe Glu Asn 
* 945 950 955 



cag acc egg ctt ccc gga ggc cag tgg ate tac atg agt gac aac tac 
Gin Thr Arg Leu Pro Gly Gly Gin Trp lie Tyr Met Ser Asp Asn Tyr 
960 965 970 



gag egg aag ccg aag cac tgg gtc cct get gag aag atg tac tac aca 
Glu Arq Lys Pro Lys His Trp Val Pro Ala Glu Lys Met Tyr Tyr Thr 
1025 1030 1035 



aag aca gat gec ttc cgc cgc cgc cgc tgg cgc cgt cgc atg gag cca 
Lys Thr Asp Ala Phe Arg Arg Arg Arg Trp Arg Arg Arg Met Glu Pro 
— — 1090 1095 1100 



1085 



ttg age ttc ggt gtg aac aga ccc acg att tec tgc ata ttc gac tat 
Leu Ser Phe Gly Val Asn Arg Pro Thr lie Ser Cys lie Phe Asp Tyr 
1135 1140 1145 

ggg aac cgc tac cat eta cgc tgc tac atg tac cag gee egg gac ctg 
Gly Asn Arg Tyr His Leu Arg Cys Tyr Met Tyr Gin Ala Arg Asp Leu 
1150 1155 1160 



3289 



acc gat gtg aac ggg gag aag gtg ctt ccc aag gat gac att gag tgc 3337 
Thr Asp Val Asn Gly Glu Lys Val Leu Pro Lys Asp Asp lie Glu Cys 
975 " 980 985 

cca ctg ggc tgg aag tgg gaa gat gag gaa tgg tec aca gac etc aac 3385 
Pro Leu Gly Trp Lys Trp Glu Asp Glu Glu Trp Ser Thr Asp Leu Asn 
990 * *" 995 1000 

egg get gtc gat gag caa ggc tgg gag tat age ate acc ate ccc ccg 3433 
Arg Ala Val Asp Glu Gin Gly Trp Glu Tyr Ser lie Thr lie Pro Pro 
1005 1010 1015 1020 



3481 



cac cga egg egg cgc tgg gtg cgc ctg cgc agg agg gat etc age caa 3529 
His Arg Arg Arg Arg Trp Val Arg Leu Arg Arg Arg Asp Leu Ser Gin 
1040 1045 1050 

atg gaa gca ctg aaa agg cac agg cag gcg gag gcg gag ggc gag ggc 3577 
Met Glu Ala Leu Lys Arg His Arg Gin Ala Glu Ala Glu Gly Glu Gly 
1055 1060 1065 

tgg gag tac gee tct ctt ttt ggc tgg aag ttc cac etc gag tac cgc 3625 
Trp Glu Tyr Ala Ser Leu Phe Gly Trp Lys Phe His Leu Glu Tyr Arg 
1070 1075 1080 



3673 



ctg gag aag acg ggg cct gca get gtg ttt gec ctt gag ggg gee ctg 3721 
Leu Glu Lys Thr Gly Pro Ala Ala Val Phe Ala Leu Glu Gly Ala Leu 
1105 1110 1115 

ggc ggc gtg atg gat gac aag agt gaa gat tec atg tec gtc tec acc 37 69 

Gly Gly Val Met Asp Asp Lys Ser Glu Asp Ser Met Ser Val Ser Thr 
1120 ^ H25 1130 



3817 



3865 



get gcg atg gac aag gac tct ttt tct gat ccc tat gee ate gtc tec 3913 
Ala Ala Met Asp Lys Asp Ser Phe Ser Asp Pro Tyr Ala lie Val Ser 
1165 1170 1175 1180 




ttc ctg cac cag age cag aag acg gtg gtg gtg aag aac acc ctt aac 
Phe Leu His Gin Ser Gin Lys Thr Val Val Val Lys Asn Thr Leu Asn 
1185 1190 H95 

ccc acc tgg gac cag acg etc ate ttc tac gag ate gag ate ttt ggc 
Pro ?hr Trp Asp Gin Thr Leu He Phe Tyr Glu He Glu lie Phe Gly 
1200 1205 1210 

gag ccg gee aca gtt get gag caa ccg ccc age att gtg gtg gag ctg 
Glu PrS Ala Thr Val Ala Glu Gin Pro Pro Ser He Val Val Glu Leu 
1215 1220 1225 

tac gac cat gac act tat ggt gca gac gag ttt atg ggt cgc tgc ate 
Tyr Asp His Asp Thr Tyr Gly Ala Asp Glu Phe Met Gly Arg Cys lie 
1230 1235 1240 

tat caa ccg agt ctg gaa egg atg cca egg ctg gee tgg ttc cca ctg 
C?s Gin Pro Ser Leu Glu Arg Met Pro Arg Leu Ala Trp Phe Pro Leu 
!245 1250 1255 1260 

aca aaa aac age cag ccg teg ggg gag ctg ctg gec tct ttt gag etc 
?h? A?g Gly sir Gin Pro Ser Gly Glu Leu Leu Ala Ser Phe Glu Leu 
1265 1270 1275 

ate cag aga gag aag ccg gee ate cac cat att cct ggt ttt gag gtg 
lie Gin Arg Glu Lys Pro Ala lie His His He Pro Gly Phe Glu Val 
1280 1285 1290 

cag gag aca tea agg ate ctg gat gag tct gag gac aca gac ctg ccc 
Gin Glu Thr Ser Arg He Leu Asp Glu Ser Glu Asp Thr Asp Leu Pro 
1295 1300 1305 

tac cca cca ccc cag agg gag gec aac ate tac atg gtt cct cag aac 
Tyr Pro Pro Pro Gin Arg Glu Ala Asn He Tyr Met Val Pro Gin Asn 
1310 1315 1320 



ate aag cca gcg etc cag cgt acc gee ate gag ate ctg gca tgg ggc 
ile Lys Pro Ala Leu Gin Arg Thr Ala He Glu lie Leu Ala Trp Gly 
1325 1330 



1335 



1340 



ctg egg aac atg aag agt tac cag ctg gec aac ate tec tec ccc age 
Leu Arg Asn Me? Lys Ser Tyr Gin Leu Ala Asn lie Ser Ser Pro Ser 
1345 1350 1355 

etc ata ata aaa tat ggg ggc cag acg gtg cag tec tgt gtc ate agg 
llu Val Vat G?u Cys Gl? Gly Gin Thr Val Gin Ser Cys Val lie Arg 
1360 1365 1370 

aac etc egg aag aac ccc aac ttt gac ate tgc acc etc ttc atg gaa 
Asn Leu Arg Lys Asn Pro Asn Phe Asp lie Cys Thr Leu Phe Met Glu 
1375 1380 1385 

gtg atg ctg ccc agg gag gag etc tac tgc ccc ccc ate acc gtc aag 
Val Met Leu Pro Arg Glu Glu Leu Tyr Cys Pro Pro Ile Thr Val Lys 
1390 1395 1400 

gtc ate gat aac cgc cag ttt ggc cgc egg cct gtg gtg ggc cag tgt 
Val Ile Asp Asn Arg Gin Phe Gly Arg Arg Pro Val Val Gly Gin Cys 
1405 1410 1415 142U 

acc ate cgc tec ctg gag age ttc ctg tgt gac ccc tac teg gcg gag 
Thr Ile Arg Ser Leu Glu Ser Phe Leu Cys Asp Pro Tyr Ser Ala Glu 
1425 1430 143b 

agt cca tec cca cag ggt ggc cca gac gat gtg age eta etc agt cct 
Ser Pro Ser Pro Gin Gly Gly Pro Asp Asp Val Ser Leu Leu Ser Pro 
144 0 1445 1450 



3961 



4009 



4057 



4105 



4153 



4201 



4249 



4297 



4345 



4393 



4441 



4489 



4537 



4585 



4633 



4681 



4729 

ggg gaa gac gtg ctc atc gac att gat gac aag gag ccc ctc atc ccc
Gly Glu Asp Val Leu Ile Asp Ile Asp Asp Lys Glu Pro Leu Ile Pro
1455 1460 1465 

ate cag gag gaa gag ttc ate gat tgg tgg age aaa ttc ttt gee tec 
lie Gin Glu Glu Glu Phe lie Asp Trp Trp Ser Lys Phe Phe Ala Ser 
1470 1475 1480 

ata ggg gag agg gaa aag tgc ggc tec tac ctg gag aag gat ttt gac 
lie Gly Glu Arg Glu Lys Cys Gly Ser Tyr Leu Glu Lys Asp Phe Asp 
1485 1490 1495 1500 

acc ctg aag gtc tat gac aca cag ctg gag aat gtg gag gee ttt gag 
Thr Leu Lys Val Tyr Asp Thr Gin Leu Glu Asn Val Glu Ala Phe Glu 
1505 1510 1515 

ggc ctg tct gac ttt tgt aac acc ttc aag ctg tac egg ggc aag acg 
Gly Leu Ser Asp Phe Cys Asn Thr Phe Lys Leu Tyr Arg Gly Lys Thr 
1520 1525 1530 

cag gag gag aca gaa gat cca tct gtg att ggt gaa ttt aag ggc etc 
Gin Glu Glu Thr Glu Asp Pro Ser Val lie Gly Glu Phe Lys Gly Leu 
1535 1540 1545 

ttc aaa att tat ccc etc cca gaa gac cca gee ate ccc atg ccc cca 
Phe Lys lie Tyr Pro Leu Pro Glu Asp Pro Ala He Pro Met Pro Pro 
1550 " 1555 1560 

aga cag ttc cac cag ctg gec gee cag gga ccc cag gag tgc ttg gtc 
Arq Gin Phe His Gin Leu Ala Ala Gin Gly Pro Gin Glu Cys Leu Val 
1565 1570 1575 1580 

cgt ate tac att gtc cga gca ttt ggc ctg cag ccc aag gac ccc aat 
Arq He Tyr He Val Arg Ala Phe Gly Leu Gin Pro Lys Asp Pro Asn 
1585 1590 1595 

gga aag tgt gat cct tac ate aag ate tec ata ggg aag aaa tea gtg 
Glv Lys Cys Asp Pro Tyr He Lys He Ser He Gly Lys Lys Ser Val 
1600 1605 1610 

agt gac cag gat aac tac ate ccc tgc acg ctg gag ccc gta ttt gga 
Ser Asp Gin Asp Asn Tyr He Pro Cys Thr Leu Glu Pro Val Phe Gly 
1615 1620 1625 

aag atg ttc gag ctg acc tgc act ctg cct ctg gag aag gac eta aag 
Lys Met Phe Glu Leu Thr Cys Thr Leu Pro Leu Glu Lys Asp Leu Lys 
1630 1635 1640 

ate act etc tat gac tat gac etc etc tec aag gac gaa aag ate ggt 
He Thr Leu Tyr Asp Tyr Asp Leu Leu Ser Lys Asp Glu Lys He Gly 
1645 ' 1650 1655 1660 

gag acg gtc gtc gac ctg gag aac agg ctg ctg tec aag ttt ggg get 
Glu Thr Val Val Asp Leu Glu Asn Arg Leu Leu Ser Lys Phe Gly Ala 
1665 1670 1675 

cgc tgt gga etc cca cag acc tac tgt gtc tct gga ccg aac cag tgg 
Arg Cys Gly Leu Pro Gin Thr Tyr Cys Val Ser Gly Pro Asn Gin Trp 
1680 1685 1690 

egg gac cag etc cgc ccc tec cag etc etc cac etc ttc tgc cag cag 
Arq Asp Gin Leu Arg Pro Ser Gin Leu Leu His Leu Phe Cys Gin Gin 
1695 1700 1705 



4777 



4825 



4873 



4921 



4969 



5017 



5065 



5113 



5161 



5209 



5257 



5305 



5353 



5401 



5449 



5497 
5545



cat aaa ate aaq gca cct gtg tac egg aca gac cgt gta atg ttt cag 
His Arg Va? iyl Ala Pro Val Tyr Arg Thr Asp Arg Val Met Phe Gin 
1710 1715 1720 

gat aaa gaa tat tcc att gaa gag ata gag gct ggc agg atc cca aac 5593
Asp Lys Glu Tyr Ser Ile Glu Glu Ile Glu Ala Gly Arg Ile Pro Asn
1725 1730 1735 1740

cca cac ctg ggc cca gtg gag gag cgt ctg gct ctg cat gtg ctt cag 5641
Pro His Leu Gly Pro Val Glu Glu Arg Leu Ala Leu His Val Leu Gln
1745 1750 1755

cag cag ggc ctg gtc ccg gag cac gtg gag tca cgg ccc ctc tac agc 5689
Gln Gln Gly Leu Val Pro Glu His Val Glu Ser Arg Pro Leu Tyr Ser
1760 1765 1770

ccc ctg cag cca gac atc gag cag ggg aag ctg cag atg tgg gtc gac 5737
Pro Leu Gln Pro Asp Ile Glu Gln Gly Lys Leu Gln Met Trp Val Asp
1775 1780 1785

cta ttt ccg aag gcc ctg ggg cgg cct gga cct ccc ttc aac atc acc 5785
Leu Phe Pro Lys Ala Leu Gly Arg Pro Gly Pro Pro Phe Asn Ile Thr
1790 1795 1800

cca cgg aga gcc aga agg ttt ttc ctg cgt tgt att atc tgg aat acc 5833
Pro Arg Arg Ala Arg Arg Phe Phe Leu Arg Cys Ile Ile Trp Asn Thr
1805 1810 1815 1820

aga gat gtg atc ctg gat gac ctg agc ctc acg ggg gag aag atg agc 5881
Arg Asp Val Ile Leu Asp Asp Leu Ser Leu Thr Gly Glu Lys Met Ser
1825 1830 1835

gac att tat gtg aaa ggt tgg atg att ggc ttt gaa gaa cac aag caa 5929
Asp Ile Tyr Val Lys Gly Trp Met Ile Gly Phe Glu Glu His Lys Gln
1840 1845 1850

aag aca gac gtg cat tat cgt tcc ctg gga ggt gaa ggc aac ttc aac 5977
Lys Thr Asp Val His Tyr Arg Ser Leu Gly Gly Glu Gly Asn Phe Asn
1855 1860 1865

tgg agg ttc att ttc ccc ttc gac tac ctg cca gct gag caa gtc tgt 6025
Trp Arg Phe Ile Phe Pro Phe Asp Tyr Leu Pro Ala Glu Gln Val Cys
1870 1875 1880

acc att gcc aag aag gat gcc ttc tgg agg ctg gac aag act gag agc 6073
Thr Ile Ala Lys Lys Asp Ala Phe Trp Arg Leu Asp Lys Thr Glu Ser
1885 1890 1895 1900

aaa atc cca gca cga gtg gtg ttc cag atc tgg gac aat gac aag ttc 6121
Lys Ile Pro Ala Arg Val Val Phe Gln Ile Trp Asp Asn Asp Lys Phe
1905 1910 1915

tcc ttt gat gat ttt ctg ggc tcc ctg cag ctc gat ctc aac cgc atg 6169
Ser Phe Asp Asp Phe Leu Gly Ser Leu Gln Leu Asp Leu Asn Arg Met
1920 1925 1930

ccc aag cca gcc aag aca gcc aag aag tgc tcc ttg gac cag ctg gat 6217
Pro Lys Pro Ala Lys Thr Ala Lys Lys Cys Ser Leu Asp Gln Leu Asp
1935 1940 1945

gat gct ttc cac cca gaa tgg ttt gtg tcc ctt ttt gag cag aaa aca 6265
Asp Ala Phe His Pro Glu Trp Phe Val Ser Leu Phe Glu Gln Lys Thr
1950 1955 1960

gtg aag ggc tgg tgg ccc tgt gta gca gaa gag ggt gag aag aaa ata 6313
Val Lys Gly Trp Trp Pro Cys Val Ala Glu Glu Gly Glu Lys Lys Ile
1965 1970 1975 1980

ctg gcg ggc aag ctg gaa atg acc ttg gag att gta gca gag agt gag 6361
Leu Ala Gly Lys Leu Glu Met Thr Leu Glu Ile Val Ala Glu Ser Glu
1985 1990 1995

cat gag gag cgg cct gct ggc cag ggc cgg gat gag ccc aac atg aac 6409
His Glu Glu Arg Pro Ala Gly Gln Gly Arg Asp Glu Pro Asn Met Asn
2000 2005 2010

cct aag ctt gag gac cca agg cgc ccc gac acc tcc ttc ctg tgg ttt 6457
Pro Lys Leu Glu Asp Pro Arg Arg Pro Asp Thr Ser Phe Leu Trp Phe
2015 2020 2025

acc tcc cca tac aag acc atg aag ttc atc ctg tgg cgg cgt ttc cgg 6505
Thr Ser Pro Tyr Lys Thr Met Lys Phe Ile Leu Trp Arg Arg Phe Arg
2030 2035 2040

tgg gcc atc atc ctc ttc atc atc ctc ttc atc ctg ctg ctg ttc ctg 6553
Trp Ala Ile Ile Leu Phe Ile Ile Leu Phe Ile Leu Leu Leu Phe Leu
2045 2050 2055 2060

gcc atc ttc atc tac gcc ttc ccg aac tat gct gcc atg aag ctg gtg 6601
Ala Ile Phe Ile Tyr Ala Phe Pro Asn Tyr Ala Ala Met Lys Leu Val
2065 2070 2075

aag ccc ttc agc tgaggactct cctgccctgt agaaggggcc gtggggtccc 6653
Lys Pro Phe Ser
2080

ctccagcatg ggactggcct gcctcctccg cccagctcgg cgagctcctc cagacctcct 6713

aggcctgatt gtcctgccag ggtgggcaga cagacagatg gaccggccca cactcccaga 6773

gttgctaaca tggagctctg agatcacccc acttccatca tttccttctc ccccaaccca 6833

acgctttttt ggatcagctc agacatattt cagtataaaa cagttggaac cacaaaaaaa 6893

aaaaaaaaaa aaaaaaaa 6911 



<210> 2 

<211> 2080 

<212> PRT 

<213> Homo sapiens 
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Leu 
Ser Arg Lys Leu Leu Ser Asp Lys Pro Gln Asp Phe Gln Ile Arg Val
210 215 220

Gln Val Ile Glu Gly Arg Gln Leu Pro Gly Val Asn Ile Lys Pro Val
225 230 235 240

Val Lys Val Thr Ala Ala Gly Gln Thr Lys Arg Thr Arg Ile His Lys
245 250 255

Gly Asn Ser Pro Leu Phe Asn Glu Thr Leu Phe Phe Asn Leu Phe Asp
260 265 270

Ser Pro Gly Glu Leu Phe Asp Glu Pro Ile Phe Ile Thr Val Val Asp
275 280 285

Ser Arg Ser Leu Arg Thr Asp Ala Leu Leu Gly Glu Phe Arg Met Asp
290 295 300

Val Gly Thr Ile Tyr Arg Glu Pro Arg His Ala Tyr Leu Arg Lys Trp
305 310 315 320

Leu Leu Leu Ser Asp Pro Asp Asp Phe Ser Ala Gly Ala Arg Gly Tyr
325 330 335

Leu Lys Thr Ser Leu Cys Val Leu Gly Pro Gly Asp Glu Ala Pro Leu
340 345 350

Glu Arg Lys Asp Pro Ser Glu Asp Lys Glu Asp Ile Glu Ser Asn Leu
355 360 365

Leu Arg Pro Thr Gly Val Ala Leu Arg Gly Ala His Phe Cys Leu Lys
370 375 380

Val Phe Arg Ala Glu Asp Leu Pro Gln Met Asp Asp Ala Val Met Asp
385 390 395 400

Asn Val Lys Gln Ile Phe Gly Phe Glu Ser Asn Lys Lys Asn Leu Val
405 410 415

Asp Pro Phe Val Glu Val Ser Phe Ala Gly Lys Met Leu Cys Ser Lys
420 425 430

Ile Leu Glu Lys Thr Ala Asn Pro Gln Trp Asn Gln Asn Ile Thr Leu
435 440 445

Pro Ala Met Phe Pro Ser Met Cys Glu Lys Met Arg Ile Arg Ile Ile
450 455 460

Asp Trp Asp Arg Leu Thr His Asn Asp Ile Val Ala Thr Thr Tyr Leu
465 470 475 480

Ser Met Ser Lys Ile Ser Ala Pro Gly Gly Glu Ile Glu Glu Glu Pro
485 490 495

Ala Gly Ala Val Lys Pro Ser Lys Ala Ser Asp Leu Asp Asp Tyr Leu
500 505 510

Gly Phe Leu Pro Thr Phe Gly Pro Cys Tyr Ile Asn Leu Tyr Gly Ser
515 520 525

Pro Arg Glu Phe Thr Gly Phe Pro Asp Pro Tyr Thr Glu Leu Asn Thr
530 535 540

Gly Lys Gly Glu Gly Val Ala Tyr Arg Gly Arg Leu Leu Leu Ser Leu
545 550 555 560

Glu Thr Lys Leu Val Glu His Ser Glu Gln Lys Val Glu Asp Leu Pro
565 570 575

Ala Asp Asp Ile Leu Arg Val Glu Lys Tyr Leu Arg Arg Arg Lys Tyr
580 585 590

Ser Leu Phe Ala Ala Phe Tyr Ser Ala Thr Met Leu Gln Asp Val Asp
595 600 605

Asp Ala Ile Gln Phe Glu Val Ser Ile Gly Asn Tyr Gly Asn Lys Phe
610 615 620

Asp Met Thr Cys Leu Pro Leu Ala Ser Thr Thr Gln Tyr Ser Arg Ala
625 630 635 640

Val Phe Asp Gly Cys His Tyr Tyr Tyr Leu Pro Trp Gly Asn Val Lys
645 650 655

Pro Val Val Val Leu Ser Ser Tyr Trp Glu Asp Ile Ser His Arg Ile
660 665 670

Glu Thr Gln Asn Gln Leu Leu Gly Ile Ala Asp Arg Leu Glu Ala Gly
675 680 685

Leu Glu Gln Val His Leu Ala Leu Lys Ala Gln Cys Ser Thr Glu Asp
690 695 700

Val Asp Ser Leu Val Ala Gln Leu Thr Asp Glu Leu Ile Ala Gly Cys
705 710 715 720

Ser Gln Pro Leu Gly Asp Ile His Glu Thr Pro Ser Ala Thr His Leu
725 730 735

Asp Gln Tyr Leu Tyr Gln Leu Arg Thr His His Leu Ser Gln Ile Thr
740 745 750

Glu Ala Ala Leu Ala Leu Lys Leu Gly His Ser Glu Leu Pro Ala Ala
755 760 765

Leu Glu Gln Ala Glu Asp Trp Leu Leu Arg Leu Arg Ala Leu Ala Glu
770 775 780

Glu Pro Gln Asn Ser Leu Pro Asp Ile Val Ile Trp Met Leu Gln Gly
785 790 795 800

Asp Lys Arg Val Ala Tyr Gln Arg Val Pro Ala His Gln Val Leu Phe
805 810 815

Ser Arg Arg Gly Ala Asn Tyr Cys Gly Lys Asn Cys Gly Lys Leu Gln
820 825 830

Thr Ile Phe Leu Lys Tyr Pro Met Glu Lys Val Pro Gly Ala Arg Met
835 840 845

Pro Val Gln Ile Arg Val Lys Leu Trp Phe Gly Leu Ser Val Asp Glu
850 855 860

Lys Glu Phe Asn Gln Phe Ala Glu Gly Lys Leu Ser Val Phe Ala Glu
865 870 875 880

Thr Tyr Glu Asn Glu Thr Lys Leu Ala Leu Val Gly Asn Trp Gly Thr
885 890 895

Thr Gly Leu Thr Tyr Pro Lys Phe Ser Asp Val Thr Gly Lys Ile Lys
900 905 910

Leu Pro Lys Asp Ser Phe Arg Pro Ser Ala Gly Trp Thr Trp Ala Gly
915 920 925

Asp Trp Phe Val Cys Pro Glu Lys Thr Leu Leu His Asp Met Asp Ala
930 935 940

Gly His Leu Ser Phe Val Glu Glu Val Phe Glu Asn Gln Thr Arg Leu
945 950 955 960

Pro Gly Gly Gln Trp Ile Tyr Met Ser Asp Asn Tyr Thr Asp Val Asn
965 970 975

Gly Glu Lys Val Leu Pro Lys Asp Asp Ile Glu Cys Pro Leu Gly Trp
980 985 990

Lys Trp Glu Asp Glu Glu Trp Ser Thr Asp Leu Asn Arg Ala Val Asp
995 1000 1005

Glu Gln Gly Trp Glu Tyr Ser Ile Thr Ile Pro Pro Glu Arg Lys Pro
1010 1015 1020

Lys His Trp Val Pro Ala Glu Lys Met Tyr Tyr Thr His Arg Arg Arg
1025 1030 1035 1040

Arg Trp Val Arg Leu Arg Arg Arg Asp Leu Ser Gln Met Glu Ala Leu
1045 1050 1055

Lys Arg His Arg Gln Ala Glu Ala Glu Gly Glu Gly Trp Glu Tyr Ala
1060 1065 1070

Ser Leu Phe Gly Trp Lys Phe His Leu Glu Tyr Arg Lys Thr Asp Ala
1075 1080 1085

Phe Arg Arg Arg Arg Trp Arg Arg Arg Met Glu Pro Leu Glu Lys Thr
1090 1095 1100

Gly Pro Ala Ala Val Phe Ala Leu Glu Gly Ala Leu Gly Gly Val Met
1105 1110 1115 1120

Asp Asp Lys Ser Glu Asp Ser Met Ser Val Ser Thr Leu Ser Phe Gly
1125 1130 1135

Val Asn Arg Pro Thr Ile Ser Cys Ile Phe Asp Tyr Gly Asn Arg Tyr
1140 1145 1150

His Leu Arg Cys Tyr Met Tyr Gln Ala Arg Asp Leu Ala Ala Met Asp
1155 1160 1165

Lys Asp Ser Phe Ser Asp Pro Tyr Ala Ile Val Ser Phe Leu His Gln
1170 1175 1180

Ser Gln Lys Thr Val Val Val Lys Asn Thr Leu Asn Pro Thr Trp Asp
1185 1190 1195 1200

Gln Thr Leu Ile Phe Tyr Glu Ile Glu Ile Phe Gly Glu Pro Ala Thr
1205 1210 1215

Val Ala Glu Gln Pro Pro Ser Ile Val Val Glu Leu Tyr Asp His Asp
1220 1225 1230

Thr Tyr Gly Ala Asp Glu Phe Met Gly Arg Cys Ile Cys Gln Pro Ser
1235 1240 1245

Leu Glu Arg Met Pro Arg Leu Ala Trp Phe Pro Leu Thr Arg Gly Ser
1250 1255 1260

Gln Pro Ser Gly Glu Leu Leu Ala Ser Phe Glu Leu Ile Gln Arg Glu
1265 1270 1275 1280

Lys Pro Ala Ile His His Ile Pro Gly Phe Glu Val Gln Glu Thr Ser
1285 1290 1295

Arg Ile Leu Asp Glu Ser Glu Asp Thr Asp Leu Pro Tyr Pro Pro Pro
1300 1305 1310

Gln Arg Glu Ala Asn Ile Tyr Met Val Pro Gln Asn Ile Lys Pro Ala
1315 1320 1325

Leu Gln Arg Thr Ala Ile Glu Ile Leu Ala Trp Gly Leu Arg Asn Met
1330 1335 1340

Lys Ser Tyr Gln Leu Ala Asn Ile Ser Ser Pro Ser Leu Val Val Glu
1345 1350 1355 1360

Cys Gly Gly Gln Thr Val Gln Ser Cys Val Ile Arg Asn Leu Arg Lys
1365 1370 1375

Asn Pro Asn Phe Asp Ile Cys Thr Leu Phe Met Glu Val Met Leu Pro
1380 1385 1390

Arg Glu Glu Leu Tyr Cys Pro Pro Ile Thr Val Lys Val Ile Asp Asn
1395 1400 1405

Arg Gln Phe Gly Arg Arg Pro Val Val Gly Gln Cys Thr Ile Arg Ser
1410 1415 1420

Leu Glu Ser Phe Leu Cys Asp Pro Tyr Ser Ala Glu Ser Pro Ser Pro
1425 1430 1435 1440

Gln Gly Gly Pro Asp Asp Val Ser Leu Leu Ser Pro Gly Glu Asp Val
1445 1450 1455

Leu Ile Asp Ile Asp Asp Lys Glu Pro Leu Ile Pro Ile Gln Glu Glu
1460 1465 1470

Glu Phe Ile Asp Trp Trp Ser Lys Phe Phe Ala Ser Ile Gly Glu Arg
1475 1480 1485

Glu Lys Cys Gly Ser Tyr Leu Glu Lys Asp Phe Asp Thr Leu Lys Val
1490 1495 1500

Tyr Asp Thr Gln Leu Glu Asn Val Glu Ala Phe Glu Gly Leu Ser Asp
1505 1510 1515 1520

Phe Cys Asn Thr Phe Lys Leu Tyr Arg Gly Lys Thr Gln Glu Glu Thr
1525 1530 1535

Glu Asp Pro Ser Val Ile Gly Glu Phe Lys Gly Leu Phe Lys Ile Tyr
1540 1545 1550

Pro Leu Pro Glu Asp Pro Ala Ile Pro Met Pro Pro Arg Gln Phe His
1555 1560 1565

Gln Leu Ala Ala Gln Gly Pro Gln Glu Cys Leu Val Arg Ile Tyr Ile
1570 1575 1580

Val Arg Ala Phe Gly Leu Gln Pro Lys Asp Pro Asn Gly Lys Cys Asp
1585 1590 1595 1600

Pro Tyr Ile Lys Ile Ser Ile Gly Lys Lys Ser Val Ser Asp Gln Asp
1605 1610 1615

Asn Tyr Ile Pro Cys Thr Leu Glu Pro Val Phe Gly Lys Met Phe Glu
1620 1625 1630

Leu Thr Cys Thr Leu Pro Leu Glu Lys Asp Leu Lys Ile Thr Leu Tyr
1635 1640 1645

Asp Tyr Asp Leu Leu Ser Lys Asp Glu Lys Ile Gly Glu Thr Val Val
1650 1655 1660

Asp Leu Glu Asn Arg Leu Leu Ser Lys Phe Gly Ala Arg Cys Gly Leu
1665 1670 1675 1680

Pro Gln Thr Tyr Cys Val Ser Gly Pro Asn Gln Trp Arg Asp Gln Leu
1685 1690 1695

Arg Pro Ser Gln Leu Leu His Leu Phe Cys Gln Gln His Arg Val Lys
1700 1705 1710

Ala Pro Val Tyr Arg Thr Asp Arg Val Met Phe Gln Asp Lys Glu Tyr
1715 1720 1725

Ser Ile Glu Glu Ile Glu Ala Gly Arg Ile Pro Asn Pro His Leu Gly
1730 1735 1740

Pro Val Glu Glu Arg Leu Ala Leu His Val Leu Gln Gln Gln Gly Leu
1745 1750 1755 1760

Val Pro Glu His Val Glu Ser Arg Pro Leu Tyr Ser Pro Leu Gln Pro
1765 1770 1775

Asp Ile Glu Gln Gly Lys Leu Gln Met Trp Val Asp Leu Phe Pro Lys
1780 1785 1790

Ala Leu Gly Arg Pro Gly Pro Pro Phe Asn Ile Thr Pro Arg Arg Ala
1795 1800 1805

Arg Arg Phe Phe Leu Arg Cys Ile Ile Trp Asn Thr Arg Asp Val Ile
1810 1815 1820

Leu Asp Asp Leu Ser Leu Thr Gly Glu Lys Met Ser Asp Ile Tyr Val
1825 1830 1835 1840

Lys Gly Trp Met Ile Gly Phe Glu Glu His Lys Gln Lys Thr Asp Val
1845 1850 1855

His Tyr Arg Ser Leu Gly Gly Glu Gly Asn Phe Asn Trp Arg Phe Ile
1860 1865 1870

Phe Pro Phe Asp Tyr Leu Pro Ala Glu Gln Val Cys Thr Ile Ala Lys
1875 1880 1885

Lys Asp Ala Phe Trp Arg Leu Asp Lys Thr Glu Ser Lys Ile Pro Ala
1890 1895 1900

Arg Val Val Phe Gln Ile Trp Asp Asn Asp Lys Phe Ser Phe Asp Asp
1905 1910 1915 1920

Phe Leu Gly Ser Leu Gln Leu Asp Leu Asn Arg Met Pro Lys Pro Ala
1925 1930 1935

Lys Thr Ala Lys Lys Cys Ser Leu Asp Gln Leu Asp Asp Ala Phe His
1940 1945 1950

Pro Glu Trp Phe Val Ser Leu Phe Glu Gln Lys Thr Val Lys Gly Trp
1955 1960 1965

Trp Pro Cys Val Ala Glu Glu Gly Glu Lys Lys Ile Leu Ala Gly Lys
1970 1975 1980

Leu Glu Met Thr Leu Glu Ile Val Ala Glu Ser Glu His Glu Glu Arg
1985 1990 1995 2000

Pro Ala Gly Gln Gly Arg Asp Glu Pro Asn Met Asn Pro Lys Leu Glu
2005 2010 2015

Asp Pro Arg Arg Pro Asp Thr Ser Phe Leu Trp Phe Thr Ser Pro Tyr
2020 2025 2030

Lys Thr Met Lys Phe Ile Leu Trp Arg Arg Phe Arg Trp Ala Ile Ile
2035 2040 2045

Leu Phe Ile Ile Leu Phe Ile Leu Leu Leu Phe Leu Ala Ile Phe Ile
2050 2055 2060

Tyr Ala Phe Pro Asn Tyr Ala Ala Met Lys Leu Val Lys Pro Phe Ser
2065 2070 2075 2080



<210> 3 

<211> 5915 

<212> DNA 

<213> Homo sapiens 

<400> 3 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360
acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420
caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480
agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540
catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600
gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660
tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720
gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780
tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840
agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900
aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960
ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020
accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080
catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140
gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200
gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260
tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320
tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380
cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440
cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500
gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560
tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620
ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680
gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740
cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800
taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860
tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920
cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980
agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040
tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100
tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160
ggccttctac tcagccacca tgctgcagga tgtggatgat gccatccagt ttgaggtcag 2220
catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280
gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340
acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400
ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460
gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520
catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580
ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640
ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700
cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760
gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820
ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880
gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940
gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000
tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060
aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120
cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180
gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240
ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300
cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360
tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420
caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480
acaccgacgg cggcgctggg tgcgcctgcg caggagggat ctcagccaaa tggaagcact 3540
gaaaaggcac aggcaggcgg aggcggaggg cgagggctgg gagtacgcct ctctttttgg 3600
ctggaagttc cacctcgagt accgcaagac agatgccttc cgccgccgcc gctggcgccg 3660
tcgcatggag ccactggaga agacggggcc tgcagctgtg tttgcccttg agggggccct 3720
gggcggcgtg atggatgaca agagtgaaga ttccatgtcc gtctccacct tgagcttcgg 3780
tgtgaacaga cccacgattt cctgcatatt cgactatggg aaccgctacc atctacgctg 3840
ctacatgtac caggcccggg acctggctgc gatggacaag gactcttttt ctgatcccta 3900
tgccatcgtc tccttcctgc accagagcca gaagacggtg gtggtgaaga acacccttaa 3960
ccccacctgg gaccagacgc tcatcttcta cgagatcgag atctttggcg agccggccac 4020
agttgctgag caaccgccca gcattgtggt ggagctgtac gaccatgaca cttatggtgc 4080
agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140
ctggttccca ctgacgaggg gcagccagcc gtcgggggag ctgctggcct cttttgagct 4200
catccagaga gagaagccgg ccatccacca tattcctggt tttgaggtgc aggagacatc 4260
aaggatcctg gatgagtctg aggacacaga cctgccctac ccaccacccc agagggaggc 4320
caacatctac atggttcctc agaacatcaa gccagcgctc cagcgtaccg ccatcgagat 4380
cctggcatgg ggcctgcgga acatgaagag ttaccagctg gccaacatct cctcccccag 4440
cctcgtggta gagtgtgggg gccagacggt gcagtcctgt gtcatcagga acctccggaa 4500
gaaccccaac tttgacatct gcaccctctt catggaagtg atgctgccca gggaggagct 4560
ctactgcccc cccatcaccg tcaaggtcat cgataaccgc cagtttggcc gccggcctgt 4620
ggtgggccag tgtaccatcc gctccctgga gagcttcctg tgtgacccct actcggcgga 4680
gagtccatcc ccacagggtg gcccagacga tgtgagccta ctcagtcctg gggaagacgt 4740
gctcatcgac attgatgaca aggagcccct catccccatc caggaggaag agttcatcga 4800
ttggtggagc aaattctttg cctccatagg ggagagggaa aagtgcggct cctacctgga 4860
gaaggatttt gacaccctga aggtctatga cacacagctg gagaatgtgg aggcctttga 4920
gggcctgtct gacttttgta acaccttcaa gctgtaccgg ggcaagacgc aggaggagac 4980
agaagatcca tctgtgattg gtgaatttaa gggcctcttc aaaatttatc ccctcccaga 5040
agacccagcc atccccatgc ccccaagaca gttccaccag ctggccgccc agggacccca 5100
ggagtgcttg gtccgtatct acattgtccg agcatttggc ctgcagccca aggaccccaa 5160
tggaaagtgt gatccttaca tcaagatctc catagggaag aaatcagtga gtgaccagga 5220
taactacatc ccctgcacgc tggagcccgt atttggaaag atgttcgagc tgacctgcac 5280
tctgcctctg gagaaggacc taaagatcac tctctatgac tatgacctcc tctccaagga 5340
cgaaaagatc ggtgagacgg tcgtcgacct ggagaacagg ctgctgtcca agtttggggc 5400
tcgctgtgga ctcccacaga cctactgtgt ctctggaccg aaccagtggc gggaccagct 5460
ccgcccctcc cagctcctcc acctcttctg ccagcagcat agagtcaagg cacctgtgta 5520

ccggacagac cgtgtaatgt ttcaggataa agaatattcc attgaagaga tagaggctgg 5580 

caggatccca aacccacacc tgggcccagt ggaggagcgt ctggctctgc atgtgcttca 5640 

gcagcagggc ctggtcccgg agcacgtgga gtcacggccc ctctacagcc ccctgcagcc 5700 

agacatcgag caggggaagc tgcagatgtg ggtcgaccta tttccgaagg ccctggggcg 5760 

gcctggacct cccttcaaca tcaccccacg gagagccaga aggtttttcc tgcgttgtat 5820 

tatctggaat accagagatg tgatcctgga tgacctgagc ctcacggggg agaagatgag 5880 

cgacatttat gtgaaaggtt ggatgattgg ctttg 5915 

<210> 4 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 4 
tgggacctca aagggcatcc 20

<210> 5 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 5 

accatgctgt aggatgtgga 20 

<210> 6 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 6 

gggaggtgaa gcaacttcaa 20

<210> 7 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 7 
ctcacggggt agaagatgag 20

<210> 8 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 8 

cagggccgag atgagcccaa 20

<210> 9 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 9 

acatcaaggg tcctggatga 20

<210> 10 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 10 

ctgtggcggt gtttccggtg 20 

<210> 11 
<211> 20 
<212> DNA

<213> Homo sapiens 

<400> 11 
acagacgtgc gttatcgttc 20

<210> 12 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 12 
aagactgagc aaaatcccag 20

<210> 13 

<211> 6912 

<212> DNA 

<213> Homo sapiens 

<400> 13 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300
agcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360
acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420
caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480
agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaagg 540
gcatccccct ggaccagggc tctgagcttc atgtggtggt caaagaccat gagacgatgg 600
ggaggaacag gttcctgggg gaagccaagg tcccactccg agaggtcctc gccaccccta 660
gtctgtccgc cagcttcaat gcccccctgc tggacaccaa gaagcagccc acaggggcct 720
cgctggtcct gcaggtgtcc tacacaccgc tgcctggagc tgtgcccctg ttcccgcccc 780
ctactcctct ggagccctcc ccgactctgc ctgacctgga tgtagtggca gacacaggag 840
gagaggaaga cacagaggac cagggactca ctggagatga ggcggagcca ttcctggatc 900
aaagcggagg cccgggggct cccaccaccc caaggaaact accttcacgt cctccgcccc 960
actaccccgg gatcaaaaga aagcgaagtg cgcctacatc tagaaagctg ctgtcagaca 1020
aaccgcagga tttccagatc agggtccagg tgatcgaggg gcgccagctg ccgggggtga 1080
acatcaagcc tgtggtcaag gttaccgctg cagggcagac caagcggacg cggatccaca 1140
agggaaacag cccactcttc aatgagactc ttttcttcaa cttgtttgac tctcctgggg 1200
agctgtttga tgagcccatc tttatcacgg tggtagactc tcgttctctc aggacagatg 1260
ctctcctcgg ggagttccgg atggacgtgg gcaccattta cagagagccc cggcacgcct 1320
atctcaggaa gtggctgctg ctctcagacc ctgatgactt ctctgctggg gccagaggct 1380
acctgaaaac aagcctttgt gtgctggggc ctggggacga agcgcctctg gagagaaaag 1440
acccctctga agacaaggag gacattgaaa gcaacctgct ccggcccaca ggcgtagccc 1500
tgcgaggagc ccacttctgc ctgaaggtct tccgggccga ggacttgccg cagatggacg 1560
atgccgtgat ggacaacgtg aaacagatct ttggcttcga gagtaacaag aagaacttgg 1620
tggacccctt tgtggaggtc agctttgcgg ggaaaatgct gtgcagcaag atcttggaga 1680
agacggccaa ccctcagtgg aaccagaaca tcacactgcc tgccatgttt ccctccatgt 1740
gcgaaaaaat gaggattcgt atcatagact gggaccgcct gactcacaat gacatcgtgg 1800
ctaccaccta cctgagtatg tcgaaaatct ctgcccctgg aggagaaata gaagaggagc 1860
ctgcaggtgc tgtcaagcct tcgaaagcct cagacttgga tgactacctg ggcttcctcc 1920
ccacttttgg gccctgctac atcaacctct atggcagtcc cagagagttc acaggcttcc 1980
cagaccccta cacagagctc aacacaggca agggggaagg tgtggcttat cgtggccggc 2040
ttctgctctc cctggagacc aagctggtgg agcacagtga acagaaggtg gaggaccttc 2100
ctgcggatga catcctccgg gtggagaagt accttaggag gcgcaagtac tccctgtttg 2160
cggccttcta ctcagccacc atgctgtagg atgtggatga tgccatccag tttgaggtca 2220
gcatcgggaa ctacgggaac aagttcgaca tgacctgcct gccgctggcc tccaccactc 2280
agtacagccg tgcagtcttt gacgggtgcc actactacta cctaccctgg ggtaacgtga 2340
aacctgtggt ggtgctgtca tcctactggg aggacatcag ccatagaatc gagactcaga 2400
accagctgct tgggattgct gaccggctgg aagctggcct ggagcaggtc cacctggccc 2460
tgaaggcgca gtgctccacg gaggacgtgg actcgctggt ggctcagctg acggatgagc 2520
tcatcgcagg ctgcagccag cctctgggtg acatccatga gacaccctct gccacccacc 2580
tggaccagta cctgtaccag ctgcgcaccc atcacctgag ccaaatcact gaggctgccc 2640
tggccctgaa gctcggccac agtgagctcc ctgcagctct ggagcaggcg gaggactggc 2700
tcctgcgtct gcgtgccctg gcagaggagc cccagaacag cctgccggac atcgtcatct 2760
ggatgctgca gggagacaag cgtgtggcat accagcgggt gcccgcccac caagtcctct 2820
tctcccggcg gggtgccaac tactgtggca agaattgtgg gaagctacag acaatctttc 2880
tgaaatatcc gatggagaag gtgcctggcg cccggatgcc agtgcagata cgggtcaagc 2940
tgtggtttgg gctctctgtg gatgagaagg agttcaacca gtttgctgag gggaagctgt 3000
ctgtctttgc tgaaacctat gagaacgaga ctaagttggc ccttgttggg aactggggca 3060
caacgggcct cacctacccc aagttttctg acgtcacggg caagatcaag ctacccaagg 3120
acagcttccg cccctcggcc ggctggacct gggctggaga ttggttcgtg tgtccggaga 3180
agactctgct ccatgacatg gacgccggtc acctgagctt cgtggaagag gtgtttgaga 3240
accagacccg gcttcccgga ggccagtgga tctacatgag tgacaactac accgatgtga 3300
acggggagaa ggtgcttccc aaggatgaca ttgagtgccc actgggctgg aagtgggaag 3360
atgaggaatg gtccacagac ctcaaccggg ctgtcgatga gcaaggctgg gagtatagca 3420
tcaccatccc cccggagcgg aagccgaagc actgggtccc tgctgagaag atgtactaca 3480
cacaccgacg gcggcgctgg gtgcgcctgc gcaggaggga tctcagccaa atggaagcac 3540
tgaaaaggca caggcaggcg gaggcggagg gcgagggctg ggagtacgcc tctctttttg 3600
gctggaagtt ccacctcgag taccgcaaga cagatgcctt ccgccgccgc cgctggcgcc 3660
gtcgcatgga gccactggag aagacggggc ctgcagctgt gtttgccctt gagggggccc 3720
tgggcggcgt gatggatgac aagagtgaag attccatgtc cgtctccacc ttgagcttcg 3780
gtgtgaacag acccacgatt tcctgcatat tcgactatgg gaaccgctac catctacgct 3840
gctacatgta ccaggcccgg gacctggctg cgatggacaa ggactctttt tctgatccct 3900
atgccatcgt ctccttcctg caccagagcc agaagacggt ggtggtgaag aacaccctta 3960
accccacctg ggaccagacg ctcatcttct acgagatcga gatctttggc gagccggcca 4020
cagttgctga gcaaccgccc agcattgtgg tggagctgta cgaccatgac acttatggtg 4080
cagacgagtt tatgggtcgc tgcatctgtc aaccgagtct ggaacggatg ccacggctgg 4140
cctggttccc actgacgagg ggcagccagc cgtcggggga gctgctggcc tcttttgagc 4200
tcatccagag agagaagccg gccatccacc atattcctgg ttttgaggtg caggagacat 4260
caaggatcct ggatgagtct gaggacacag acctgcccta cccaccaccc cagagggagg 4320
ccaacatcta catggttcct cagaacatca agccagcgct ccagcgtacc gccatcgaga 4380
tcctggcatg gggcctgcgg aacatgaaga gttaccagct ggccaacatc tcctccccca 4440
gcctcgtggt agagtgtggg ggccagacgg tgcagtcctg tgtcatcagg aacctccgga 4500
agaaccccaa ctttgacatc tgcaccctct tcatggaagt gatgctgccc agggaggagc 4560
tctactgccc ccccatcacc gtcaaggtca tcgataaccg ccagtttggc cgccggcctg 4620
tggtgggcca gtgtaccatc cgctccctgg agagcttcct gtgtgacccc tactcggcgg 4680
agagtccatc cccacagggt ggcccagacg atgtgagcct actcagtcct ggggaagacg 4740
tgctcatcga cattgatgac aaggagcccc tcatccccat ccaggaggaa gagttcatcg 4800
attggtggag caaattcttt gcctccatag gggagaggga aaagtgcggc tcctacctgg 4860
agaaggattt tgacaccctg aaggtctatg acacacagct ggagaatgtg gaggcctttg 4920
agggcctgtc tgacttttgt aacaccttca agctgtaccg gggcaagacg caggaggaga 4980
cagaagatcc atctgtgatt ggtgaattta agggcctctt caaaatttat cccctcccag 5040
aagacccagc catccccatg cccccaagac agttccacca gctggccgcc cagggacccc 5100
aggagtgctt ggtccgtatc tacattgtcc gagcatttgg cctgcagccc aaggacccca 5160
atggaaagtg tgatccttac atcaagatct ccatagggaa gaaatcagtg agtgaccagg 5220
ataactacat cccctgcacg ctggagcccg tatttggaaa gatgttcgag ctgacctgca 5280
ctctgcctct ggagaaggac ctaaagatca ctctctatga ctatgacctc ctctccaagg 5340
acgaaaagat cggtgagacg gtcgtcgacc tggagaacag gctgctgtcc aagtttgggg 5400
ctcgctgtgg actcccacag acctactgtg tctctggacc gaaccagtgg cgggaccagc 5460
tccgcccctc ccagctcctc cacctcttct gccagcagca tagagtcaag gcacctgtgt 5520
accggacaga ccgtgtaatg tttcaggata aagaatattc cattgaagag atagaggctg 5580
gcaggatccc aaacccacac ctgggcccag tggaggagcg tctggctctg catgtgcttc 5640
agcagcaggg cctggtcccg gagcacgtgg agtcacggcc cctctacagc cccctgcagc 5700
cagacatcga gcaggggaag ctgcagatgt gggtcgacct atttccgaag gccctggggc 5760
ggcctggacc tcccttcaac atcaccccac ggagagccag aaggtttttc ctgcgttgta 5820
ttatctggaa taccagagat gtgatcctgg atgacctgag cctcacgggg gagaagatga 5880
gcgacattta tgtgaaaggt tggatgattg gctttgaaga acacaagcaa aagacagacg 5940
tgcattatcg ttccctggga ggtgaaggca acttcaactg gaggttcatt ttccccttcg 6000
actacctgcc agctgagcaa gtctgtacca ttgccaagaa ggatgccttc tggaggctgg 6060
acaagactga gagcaaaatc ccagcacgag tggtgttcca gatctgggac aatgacaagt 6120
tctcctttga tgattttctg ggctccctgc agctcgatct caaccgcatg cccaagccag 6180
ccaagacagc caagaagtgc tccttggacc agctggatga tgctttccac ccagaatggt 6240
ttgtgtccct ttttgagcag aaaacagtga agggctggtg gccctgtgta gcagaagagg 6300
gtgagaagaa aatactggcg ggcaagctgg aaatgacctt ggagattgta gcagagagtg 6360
agcatgagga gcggcctgct ggccagggcc gggatgagcc caacatgaac cctaagcttg 6420
aggacccaag gcgccccgac acctccttcc tgtggtttac ctccccatac aagaccatga 6480
agttcatcct gtggcggcgt ttccggtggg ccatcatcct cttcatcatc ctcttcatcc 6540
tgctgctgtt cctggccatc ttcatctacg ccttcccgaa ctatgctgcc atgaagctgg 6600
tgaagccctt cagctgagga ctctcctgcc ctgtagaagg ggccgtgggg tcccctccag 6660
catgggactg gcctgcctcc tccgcccagc tcggcgagct cctccagacc tcctaggcct 6720
gattgtcctg ccagggtggg cagacagaca gatggaccgg cccacactcc cagagttgct 6780
aacatggagc tctgagatca ccccacttcc atcatttcct tctcccccaa cccaacgctt 6840
ttttggatca gctcagacat atttcagtat aaaacagttg gaaccacaaa aaaaaaaaaa 6900
aaaaaaaaaa aa 6912

<210> 14

<211> 6911

<212> DNA

<213> Homo sapiens



<400> 14 
tcgaccgccc agccaggtgc 
agattacagc tcgacggagc 
tgttctcgga acgccggctg 
gcccactgga gcagccgggg 
agccagagat tcgagccggc 
ggcgcctcgg ccctcccgac 
acacgcgcca agcatgctga 
caccgacatc agcgatgcct 
agtcatcaag aacagcgtga 
catccccctg gaccagggct 
gaggaacagg ttcctggggg 
tctgtccgcc agcttcaatg 
gctggtcctg caggtgtcct 
tactcctctg gagccctccc 
agaggaagac acagaggacc 
aagcggaggc ccgggggctc 
ctaccccggg atcaaaagaa 
accgcaggat ttccagatca 
catcaagcct gtggtcaagg 
gggaaacagc ccactcttca 
gctgtttgat gagcccatct 
tctcctcggg gagttccgga 
tctcaggaag tggctgctgc 
cctgaaaaca agcctttgtg 
cccctctgaa gacaaggagg 
gcgaggagcc cacttctgcc 
tgccgtgatg gacaacgtga 
ggaccccttt gtggaggtca 
gacggccaac cctcagtgga 
cgaaaaaatg aggattcgta 
taccacctac ctgagtatgt 
tgcaggtgct gtcaagcctt 
cacttttggg ccctgctaca 
agacccctac acagagctca 
tctgctctcc ctggagacca 
tgcggatgac atcctccggg 
ggccttctac tcagccacca 
catcgggaac tacgggaaca 
gtacagccgt gcagtctttg 
acctgtggtg gtgctgtcat 
ccagctgctt gggattgctg 
gaaggcgcag tgctccacgg 
catcgcaggc tgcagccagc 
ggaccagtac ctgtaccagc 
ggccctgaag ctcggccaca 
cctgcgtctg cgtgccctgg 
gatgctgcag ggagacaagc 
ctcccggcgg ggtgccaact 
gaaatatccg atggagaagg 
gtggtttggg ctctctgtgg 
tgtctttgct gaaacctatg 
aacgggcctc acctacccca 
cagcttccgc ccctcggccg 
gactctgctc catgacatgg 
ccagacccgg cttcccggag 
cggggagaag gtgcttccca 
tgaggaatgg tccacagacc 
caccatcccc ccggagcgga 



aaaatgccgt 
tcgggaaggg 
acaagcgggg 
gtggcccgtt 
ctcgcccagc 
ctttccgagc 
gggtcttcat 
actgctccgc 
accctgtatg 
ctgagcttca 
aagccaaggt 
cccccctgct 
acacaccgct 
cgactctgcc 
agggactcac 
ccaccacccc 
agcgaagtgc 
gggtccaggt 
ttaccgctgc 
atgagactct 
ttatcacggt 
tggacgtggg 
tctcagaccc 
tgctggggcc 
acattgaaag 
tgaaggtctt 
aacagatctt 
gctttgcggg 
accagaacat 
tcatagactg 
cgaaaatctc 
cgaaagcctc 
tcaacctcta 
acacaggcaa 
agctggtgga 
tggagaagta 
tgctgtagga 
agttcgacat 
acgggtgcca 
cctactggga 
accggctgga 
aggacgtgga 
ctctgggtga 
tgcgcaccca 
gtgagctccc 
cagaggagcc 
gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 



gtcattggga 
cggcgggggt 
tgagcgcagg 
cccctttaag 
cagccctctc 
cctctttgcg 
cctctatgcc 
ggtgtttgca 
gaatgaggga 
tgtggtggtc 
cccactccga 
ggacaccaag 
gcctggagct 
tgacctggat 
tggagatgag 
aaggaaacta 
gcctacatct 
gatcgagggg 
agggcagacc 
tttcttcaac 
ggtagactct 
caccatttac 
tgatgacttc 
tggggacgaa 
caacctgctc 
ccgggccgag 
tggcttcgag 
gaaaatgctg 
cacactgcct 
ggaccgcctg 
tgcccctgga 
agacttggat 
tggcagtccc 
gggggaaggt 
gcacagtgaa 
ccttaggagg 
tgtggatgat 
gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 
ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca 
tgtcgatgag 
ctgggtccct 



gactccgcag 
ggaagatgag 
cggggcgggg 
agcaactgct 
cagcgagggg 
ccctgggcgc 
gagaacgtcc 
ggggtgaaga 
tttgaatggg 
aaagaccatg 
gaggtcctcg 
aagcagccca 
gtgcccctgt 
gtagtggcag 
gcggagccat 
ccttcacgtc 
agaaagctgc 
cgccagctgc 
aagcggacgc 
ttgtttgact 
cgttctctca 
agagagcccc 
tctgctgggg 
gcgcctctgg 
cggcccacag 
gacttgccgc 
agtaacaaga 
tgcagcaaga 
gccatgtttc 
actcacaatg 
ggagaaatag 
gactacctgg 
agagagttca 
gtggcttatc 
cagaaggtgg 
cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 



ccggagcatt 
cagaagcccc 
acccagccta 
ctaagccagg 
acccacaagc 
acggggccct 
acacacccga 
agagaaccaa 
acctcaaggg 
agacgatggg 
ccacccctag 
caggggcctc 
tcccgccccc 
acacaggagg 
tcctggatca 
ctccgcccca 
tgtcagacaa 

cgggggtgaa 
ggatccacaa 
ctcctgggga 
ggacagatgc 
ggcacgccta 
ccagaggcta 
agagaaaaga 
gcgtagccct 
agatggacga 
agaacttggt 
tcttggagaa 
cctccatgtg 
acatcgtggc 
aagaggagcc 
gcttcctccc 
caggcttccc 
gtggccggct 
aggaccttcc 
ccctgtttgc 
ttgaggtcag 
ccaccactca 
gtaacgtgaa 
agactcagaa 
acctggccct 
cggatgagct 
ccacccacct 
aggctgccct 
aggactggct 
tcgtcatctg 
aagtcctctt 
caatctttct 
gggtcaagct 
ggaagctgtc 
actggggcac 
tacccaagga 
gtccggagaa 
tgtttgagaa 
ccgatgtgaa 
agtgggaaga 
agtatagcat 
tgtactacac 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
acaccgacgg 

gaaaaggcac 

ctggaagttc 

tcgcatggag 

gggcggcgtg 

tgtgaacaga 

ctacatgtac 

tgccatcgtc 

ccccacctgg 

agttgctgag 

agacgagttt 

ctggttccca 

catccagaga 

aaggatcctg 

caacatctac 

cctggcatgg 

cctcgtggta 

gaaccccaac 

ctactgcccc 

ggtgggccag 

gagtccatcc 

gctcatcgac 

ttggtggagc 

gaaggatttt 

gggcctgtct 

agaagatcca 

agacccagcc 

ggagtgcttg 

tggaaagtgt 

taactacatc 

tctgcctctg 

cgaaaagatc 

tcgctgtgga 

ccgcccctcc 

ccggacagac 

caggatccca 

gcagcagggc 

agacatcgag 

gcctggacct 

tatctggaat 

cgacatttat 

gcattatcgt 

ctacctgcca 

caagactgag 

ctcctttgat 

caagacagcc 

tgtgtccctt 

tgagaagaaa 

gcatgaggag 

ggacccaagg 

gttcatcctg 

gctgctgttc 

gaagcccttc 

atgggactgg 

attgtcctgc 

acatggagct 

tttggatcag 

aaaaaaaaaa 



cggcgctggg 

aggcaggcgg 

cacctcgagt 

ccactggaga 

atggatgaca 

cccacgattt 

caggcccggg 

tccttcctgc 

gaccagacgc 

caaccgccca 

atgggtcgct 

ctgacgaggg 

gagaagccgg 

gatgagtctg 

atggttcctc 

ggcctgcgga 

gagtgtgggg 

tttgacatct 

cccatcaccg 

tgtaccatcc 

ccacagggtg 

attgatgaca 

aaattctttg 

gacaccctga 

gacttttgta 

tctgtgattg 

atccccatgc 

gtccgtatct 

gatccttaca 

ccctgcacgc 

gagaaggacc 

ggtgagacgg 

ctcccacaga 

cagctcctcc 

cgtgtaatgt 

aacccacacc 

ctggtcccgg 

caggggaagc 

cccttcaaca 

accagagatg 

gtgaaaggtt 

tccctgggag 

gctgagcaag 

agcaaaatcc 

gattttctgg 

aagaagtgct 

tttgagcaga 

atactggcgg 

cggcctgctg 

cgccccgaca 

tggcggcgtt 

ctggccatct 

agctgaggac 

cctgcctcct 

cagggtgggc 

ctgagatcac 

ctcagacata 



tgcgcctgcg 

aggcggaggg 

accgcaagac 

agacggggcc 

agagtgaaga 

cctgcatatt 

acctggctgc 

accagagcca 

tcatcttcta 

gcattgtggt 

gcatctgtca 

gcagccagcc 

ccatccacca 

aggacacaga 

agaacatcaa 

acatgaagag 

gccagacggt 

gcaccctctt 

tcaaggtcat 

gctccctgga 

gcccagacga 

aggagcccct 

cctccatagg 

aggtctatga 

acaccttcaa 

gtgaatttaa 

ccccaagaca 

acattgtccg 

tcaagatctc 

tggagcccgt 

taaagatcac 

tcgtcgacct 

cctactgtgt 

acctcttctg 

ttcaggataa 

tgggcccagt 

agcacgtgga 

tgcagatgtg 

tcaccccacg 

tgatcctgga 

ggatgattgg 

gtgaaggcaa 

tctgtaccat 

cagcacgagt 

gctccctgca 

ccttggacca 

aaacagtgaa 

gcaagctgga 

gccagggccg 

cctccttcct 

tccggtgggc 

tcatctacgc 

tctcctgccc 

ccgcccagct 

agacagacag 

cccacttcca 

tttcagtata 



caggagggat 

cgagggctgg 

agatgccttc 

tgcagctgtg 

ttccatgtcc 

cgactatggg 

gatggacaag 

gaagacggtg 

cgagatcgag 

ggagctgtac 

accgagtctg 

gtcgggggag 

tattcctggt 

cctgccctac 

gccagcgctc 

ttaccagctg 

gcagtcctgt 

catggaagtg 

cgataaccgc 

gagcttcctg 

tgtgagccta 

catccccatc 

ggagagggaa 

cacacagctg 

gctgtaccgg 

gggcctcttc 

gttccaccag 

agcatttggc 

catagggaag 

atttggaaag 

tctctatgac 

ggagaacagg 

ctctggaccg 

ccagcagcat 

agaatattcc 

ggaggagcgt 

gtcacggccc 

ggtcgaccta 

gagagccaga 

tgacctgagc 

ctttgaagaa 

cttcaactgg 

tgccaagaag 

ggtgttccag 

gctcgatctc 

gctggatgat 

gggctggtgg 

aatgaccttg 

ggatgagccc 

gtggtttacc 

catcatcctc 

cttcccgaac 

tgtagaaggg 

cggcgagctc 

atggaccggc 

tcatttcctt 

aaacagttgg 



ctcagccaaa 

gagtacgcct 

cgccgccgcc 

tttgcccttg 

gtctccacct 

aaccgctacc 

gactcttttt 

gtggtgaaga 

atctttggcg 

gaccatgaca 

gaacggatgc 

ctgctggcct 

tttgaggtgc 

ccaccacccc 

cagcgtaccg 

gccaacatct 

gtcatcagga 

atgctgccca 

cagtttggcc 

tgtgacccct 

ctcagtcctg 

caggaggaag 

aagtgcggct 

gagaatgtgg 

ggcaagacgc 

aaaatttatc 

ctggccgccc 

ctgcagccca 

aaatcagtga 

atgttcgagc 

tatgacctcc 

ctgctgtcca 

aaccagtggc 

agagtcaagg 

attgaagaga 

ctggctctgc 

ctctacagcc 

tttccgaagg 

aggtttttcc 

ctcacggggg 

cacaagcaaa 

aggttcattt 

gatgccttct 

atctgggaca 

aaccgcatgc 

gctttccacc 

ccctgtgtag 

gagattgtag 

aacatgaacc 

tccccataca 

ttcatcatcc 

tatgctgcca 

gccgtggggt 

ctccagacct 

ccacactccc 

ctcccccaac 

aaccacaaaa 



tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 

cacctgtgta 

tagaggctgg 

atgtgcttca 

ccctgcagcc 

ccctggggcg 

tgcgttgtat 

agaagatgag 

agacagacgt 

tccccttcga 

ggaggctgga 

atgacaagtt 

ccaagccagc 

cagaatggtt 

cagaagaggg 

cagagagtga 

ctaagcttga 

agaccatgaa 

tcttcatcct 

tgaagctggt 

cccctccagc 

cctaggcctg 

agagttgcta 

ccaacgcttt 

aaaaaaaaaa 



3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6911 



<210> 15 

<211> 6910 

<212> DNA 

<213> Homo sapiens 

<400> 15 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480 

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540 

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600 

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660 

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720 

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780 

tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840 

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900 

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960 

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020 

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080 

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140 

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200 

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260 

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320 

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380 

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440 

cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500 

gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560 

tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620 

ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680 

gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740 

cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800 

taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860 

tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920 

cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980 

agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040 

tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100 

tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160 

ggccttctac tcagccacca tgctgcagga tgtggatgat gccatccagt ttgaggtcag 2220 

catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280 

gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340 

acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400 

ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460 

gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520 

catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580 

ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640 

ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700 

cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760 

gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820 

ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880 

gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940 

gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000 

tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060 

aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120 

cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180 

gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240 

ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300 

cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360 

tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420 

caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480 

acaccgacgg cggcgctggg tgcgcctgcg caggagggat ctcagccaaa tggaagcact 3540 

gaaaaggcac aggcaggcgg aggcggaggg cgagggctgg gagtacgcct ctctttttgg 3600 

ctggaagttc cacctcgagt accgcaagac agatgccttc cgccgccgcc gctggcgccg 3660 

tcgcatggag ccactggaga agacggggcc tgcagctgtg tttgcccttg agggggccct 3720 

gggcggcgtg atggatgaca agagtgaaga ttccatgtcc gtctccacct tgagcttcgg 3780 

tgtgaacaga cccacgattt cctgcatatt cgactatggg aaccgctacc atctacgctg 3840 

ctacatgtac caggcccggg acctggctgc gatggacaag gactcttttt ctgatcccta 3900 

tgccatcgtc tccttcctgc accagagcca gaagacggtg gtggtgaaga acacccttaa 3960 

ccccacctgg gaccagacgc tcatcttcta cgagatcgag atctttggcg agccggccac 4020 

agttgctgag caaccgccca gcattgtggt ggagctgtac gaccatgaca cttatggtgc 4080 

agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
tacctgccag 
aagactgaga 
tcctttgatg 
aagacagcca 
gtgtcccttt 
gagaagaaaa 
catgaggagc 
gacccaaggc 
ttcatcctgt 
ctgctgttcc 
aagcccttca 
tgggactggc 
ttgtcctgcc 
catggagctc 
ttggatcagc 
aaaaaaaaaa 



ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
ctgagcaagt 
gcaaaatccc 
attttctggg 
agaagtgctc 
ttgagcagaa 
tactggcggg 
ggcctgctgg 
gccccgacac 
ggcggcgttt 
tggccatctt 
gctgaggact 
ctgcctcctc 
agggtgggca 
tgagatcacc 
tcagacatat 



gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaagcaac 
ctgtaccatt 
agcacgagtg 
ctccctgcag 
cttggaccag 
aacagtgaag 
caagctggaa 
ccagggccgg 
ctccttcctg 
ccggtgggcc 
catctacgcc 
ctcctgccct 
cgcccagctc 
gacagacaga 
ccacttccat 
ttcagtataa 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 
ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
ttcaactgga 
gccaagaagg 
gtgttccaga 
ctcgatctca 
ctggatgatg 
ggctggtggc 
atgaccttgg 
gatgagccca 
tggtttacct 
atcatcctct 
ttcccgaact 
gtagaagggg 
ggcgagctcc 
tggaccggcc 
catttccttc 
aacagttgga 



ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 
agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
ggttcatttt 
atgccttctg 
tctgggacaa 
accgcatgcc 
ctttccaccc 
cctgtgtagc 
agattgtagc 
acatgaaccc 
ccccatacaa 
tcatcatcct 
atgctgccat 
ccgtggggtc 
tccagacctc 
cacactccca 
tcccccaacc 
accacaaaaa 



cttttgagct 
aggagacatc 
agagggaggc 
ccatcgagat 
cctcccccag 
acctccggaa 
gggaggagct 
gccggcctgt 
actcggcgga 
gggaagacgt 
agttcatcga 
cctacctgga 
aggcctttga 
aggaggagac 
ccctcccaga 
agggacccca 
aggaccccaa 
gtgaccagga 
tgacctgcac 
tctccaagga 
agtttggggc 
gggaccagct 
cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
ccccttcgac 
gaggctggac 
tgacaagttc 
caagccagcc 
agaatggttt 
agaagagggt 
agagagtgag 
taagcttgag 
gaccatgaag 
cttcatcctg 
gaagctggtg 
ccctccagca 
ctaggcctga 
gagttgctaa 
caacgctttt 
aaaaaaaaaa 



4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6910 



<210> 16 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 16 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480 

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540 

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600 

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660 

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720 

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780 
tactcctctg 
agaggaagac 
aagcggaggc 
ctaccccggg 
accgcaggat 
catcaagcct 
gggaaacagc 
gctgtttgat 
tctcctcggg 
tctcaggaag 
cctgaaaaca 
cccctctgaa 
gcgaggagcc 
tgccgtgatg 
ggaccccttt 
gacggccaac 
cgaaaaaatg 
taccacctac 
tgcaggtgct 
cacttttggg 
agacccctac 
tctgctctcc 
tgcggatgac 
ggccttctac 
catcgggaac 
gtacagccgt 
acctgtggtg 
ccagctgctt 
gaaggcgcag 
catcgcaggc 
ggaccagtac 
ggccctgaag 
cctgcgtctg 
gatgctgcag 
ctcccggcgg 
gaaatatccg 
gtggtttggg 
tgtctttgct 
aacgggcctc 
cagcttccgc 
gactctgctc 
ccagacccgg 
cggggagaag 
tgaggaatgg 
caccatcccc 
acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 



gagccctccc 
acagaggacc 
ccgggggctc 
atcaaaagaa 
ttccagatca 
gtggtcaagg 
ccactcttca 
gagcccatct 
gagttccgga 
tggctgctgc 
agcctttgtg 
gacaaggagg 
cacttctgcc 
gacaacgtga 
gtggaggtca 
cctcagtgga 
aggattcgta 
ctgagtatgt 
gtcaagcctt 
ccctgctaca 
acagagctca 
ctggagacca 
atcctccggg 
tcagccacca 
tacgggaaca 
gcagtctttg 
gtgctgtcat 
gggattgctg 
tgctccacgg 
tgcagccagc 
ctgtaccagc 
ctcggccaca 
cgtgccctgg 
ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 



cgactctgcc 
agggactcac 
ccaccacccc 
agcgaagtgc 
gggtccaggt 
ttaccgctgc 
atgagactct 
ttatcacggt 
tggacgtggg 
tctcagaccc 
tgctggggcc 
acattgaaag 
tgaaggtctt 
aacagatctt 
gctttgcggg 
accagaacat 
tcatagactg 
cgaaaatctc 
cgaaagcctc 
tcaacctcta 
acacaggcaa 
agctggtgga 
tggagaagta 
tgctgcagga 
agttcgacat 
acgggtgcca 
cctactggga 
accggctgga 
aggacgtgga 
ctctgggtga 
tgcgcaccca 
gtgagctccc 
cagaggagcc 
gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 



aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 



tgacctggat 
tggagatgag 
aaggaaacta 
gcctacatct 
gatcgagggg 
agggcagacc 
tttcttcaac 
ggtagactct 
caccatttac 
tgatgacttc 
tggggacgaa 
caacctgctc 
ccgggccgag 
tggcttcgag 
gaaaatgctg 
cacactgcct 
ggaccgcctg 
tgcccctgga 
agacttggat 
tggcagtccc 
gggggaaggt 
gcacagtgaa 
ccttaggagg 
tgtggatgat 
gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 
ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca 
tgtcgatgag 
ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 



gtagtggcag 
gcggagccat 
ccttcacgtc 
agaaagctgc 
cgccagctgc 
aagcggacgc 
ttgtttgact 
cgttctctca 
agagagcccc 
tctgctgggg 
gcgcctctgg 
cggcccacag 
gacttgccgc 
agtaacaaga 
tgcagcaaga 
gccatgtttc 
actcacaatg 
ggagaaatag 
gactacctgg 
agagagttca 
gtggcttatc 
cagaaggtgg 
cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 



acacaggagg 

tcctggatca 

ctccgcccca 

tgtcagacaa 

cgggggtgaa 

ggatccacaa 

ctcctgggga 

ggacagatgc 

ggcacgccta 

ccagaggcta 

agagaaaaga 

gcgtagccct 

agatggacga 

agaacttggt 

tcttggagaa 

cctccatgtg 

acatcgtggc 

aagaggagcc 

gcttcctccc 

caggcttccc 

gtggccggct 

aggaccttcc 

ccctgtttgc 

ttgaggtcag 

ccaccactca 

gtaacgtgaa 

agactcagaa 

acctggccct 

cggatgagct 

ccacccacct 

aggctgccct 

aggactggct 

tcgtcatctg 

aagtcctctt 

caatctttct 

gggtcaagct 

ggaagctgtc 

actggggcac 

tacccaagga 

gtccggagaa 

tgtttgagaa 

ccgatgtgaa 

agtgggaaga 

agtatagcat 

tgtactacac 

tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 



840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 
ctcctttgat 
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 
gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 



ggagagggaa aagtgcggct 
cacacagctg gagaatgtgg 
gctgtaccgg ggcaagacgc 
gggcctcttc aaaatttatc 
gttccaccag ctggccgccc 
agcatttggc ctgcagccca 
catagggaag aaatcagtga 
atttggaaag atgttcgagc 
tctctatgac tatgacctcc 
ggagaacagg ctgctgtcca 
ctctggaccg aaccagtggc 
ccagcagcat agagtcaagg 
agaatattcc attgaagaga 
ggaggagcgt ctggctctgc 
gtcacggccc ctctacagcc 
ggtcgaccta tttccgaagg 
gagagccaga aggtttttcc 
tgacctgagc ctcacggggt 
ctttgaagaa cacaagcaaa 
cttcaactgg aggttcattt 
tgccaagaag gatgccttct 
ggtgttccag atctgggaca 
gctcgatctc aaccgcatgc 
gctggatgat gctttccacc 
gggctggtgg ccctgtgtag 
aatgaccttg gagattgtag 
ggatgagccc aacatgaacc 
gtggtttacc tccccataca 
catcatcctc ttcatcatcc 
cttcccgaac tatgctgcca 
tgtagaaggg gccgtggggt 
cggcgagctc ctccagacct 
atggaccggc ccacactccc 
tcatttcctt ctcccccaac 
aaacagttgg aaccacaaaa 



cctacctgga 
aggcctttga 
aggaggagac 
ccctcccaga 
agggacccca 
aggaccccaa 
gtgaccagga 
tgacctgcac 
tctccaagga 
agtttggggc 
gggaccagct 
cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 



4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6911 



<210> 17 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 17 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480 

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540 

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600 

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660 

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720 

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780 

tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840 

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900 

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960 

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020 

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080 

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140 

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200 

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260 

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320 

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380 

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440 
cccctctgaa 
gcgaggagcc 
tgccgtgatg 
ggaccccttt 
gacggccaac 
cgaaaaaatg 
taccacctac 
tgcaggtgct 
cacttttggg 
agacccctac 
tctgctctcc 
tgcggatgac 
ggccttctac 
catcgggaac 
gtacagccgt 
acctgtggtg 
ccagctgctt 
gaaggcgcag 
catcgcaggc 
ggaccagtac 
ggccctgaag 
cctgcgtctg 
gatgctgcag 
ctcccggcgg 
gaaatatccg 
gtggtttggg 
tgtctttgct 
aacgggcctc 
cagcttccgc 
gactctgctc 
ccagacccgg 
cggggagaag 
tgaggaatgg 
caccatcccc 
acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 



gacaaggagg 
cacttctgcc 
gacaacgtga 
gtggaggtca 
cctcagtgga 
aggattcgta 
ctgagtatgt 
gtcaagcctt 
ccctgctaca 
acagagctca 
ctggagacca 
atcctccggg 
tcagccacca 
tacgggaaca 
gcagtctttg 
gtgctgtcat 
gggattgctg 
tgctccacgg 
tgcagccagc 
ctgtaccagc 
ctcggccaca 
cgtgccctgg 
ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 



acattgaaag 
tgaaggtctt 
aacagatctt 
gctttgcggg 
accagaacat 
tcatagactg 
cgaaaatctc 
cgaaagcctc 
tcaacctcta 
acacaggcaa 
agctggtgga 
tggagaagta 
tgctgcagga 
agttcgacat 
acgggtgcca 
cctactggga 
accggctgga 
aggacgtgga 
ctctgggtga 
tgcgcaccca 
gtgagctccc 
cagaggagcc 
gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 



caacctgctc 
ccgggccgag 
tggcttcgag 
gaaaatgctg 
cacactgcct 
ggaccgcctg 
tgcccctgga 
agacttggat 
tggcagtccc 
gggggaaggt 
gcacagtgaa 
ccttaggagg 
tgtggatgat 
gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 
ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca 
tgtcgatgag 
ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 



cggcccacag 
gacttgccgc 
agtaacaaga 
tgcagcaaga 
gccatgtttc 
actcacaatg 
ggagaaatag 
gactacctgg 
agagagttca 
gtggcttatc 
cagaaggtgg 
cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 



gcgtagccct 
agatggacga 
agaacttggt 
tcttggagaa 
cctccatgtg 
acatcgtggc 
aagaggagcc 
gcttcctccc 
caggcttccc 
gtggccggct 
aggaccttcc 
ccctgtttgc 
ttgaggtcag 
ccaccactca 
gtaacgtgaa 
agactcagaa 
acctggccct 
cggatgagct 
ccacccacct 
aggctgccct 
aggactggct 
tcgtcatctg 
aagtcctctt 
caatctttct 
gggtcaagct 
ggaagctgtc 
actggggcac 
tacccaagga 
gtccggagaa 
tgtttgagaa 
ccgatgtgaa 
agtgggaaga 
agtatagcat 
tgtactacac 
tggaagcact 
ctctttttgg 
gctggcgccg 
agggggccct 
tgagcttcgg 
atctacgctg 
ctgatcccta 
acacccttaa 



agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 



1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 
ccgcccctcc
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 
ctcctttgat 
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 
gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 



ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
cttcaactgg 
tgccaagaag 
ggtgttccag 
gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
agatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 
tcatttcctt 
aaacagttgg 



agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 
aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 
ctcccccaac 
aaccacaaaa 



cacctgtgta
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 



<210> 18 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 18 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 
acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 
caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 
agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 
catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 
gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 
tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 
gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 
tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 
agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 
aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 
ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 
accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 
catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 
gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 
gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 
tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 
tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 
cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 
cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 
gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 
tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 
ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 
gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 
cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 
taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 
tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 
cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 
agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 
tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 
tgcggatgac
ggccttctac 
catcgggaac 
gtacagccgt 
acctgtggtg 
ccagctgctt 
gaaggcgcag 
catcgcaggc 
ggaccagtac 
ggccctgaag 
cctgcgtctg 
gatgctgcag 
ctcccggcgg 
gaaatatccg 
gtggtttggg 
tgtctttgct 
aacgggcctc 
cagcttccgc 
gactctgctc 
ccagacccgg 
cggggagaag 
tgaggaatgg 
caccatcccc 
acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aagggtcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 



atcctccggg 
tcagccacca 
tacgggaaca 
gcagtctttg 
gtgctgtcat 
gggattgctg 
tgctccacgg 
tgcagccagc 
ctgtaccagc 
ctcggccaca 
cgtgccctgg 
ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 



tggagaagta 
tgctgcagga 
agttcgacat 
acgggtgcca 
cctactggga 
accggctgga 
aggacgtgga 
ctctgggtga 
tgcgcaccca 
gtgagctccc 
cagaggagcc 
gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 



ccttaggagg 

tgtggatgat 

gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 
ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca 
tgtcgatgag 
ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 
ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa
cttcaactgg 
tgccaagaag 
ggtgttccag 



cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 
agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 



ccctgtttgc 
ttgaggtcag 
ccaccactca 



gtaacgtgaa 
agactcagaa 
acctggccct 
cggatgagct 
ccacccacct 
aggctgccct 
aggactggct 
tcgtcatctg 
aagtcctctt 
caatctttct 
gggtcaagct 
ggaagctgtc 
actggggcac 
tacccaagga 
gtccggagaa 
tgtttgagaa 
ccgatgtgaa 
agtgggaaga 
agtatagcat 
tgtactacac 
tggaagcact 
ctctttttgg 
gctggcgccg 

agggggccct 
tgagcttcgg 
atctacgctg 
ctgatcccta 
acacccttaa 
agccggccac 
cttatggtgc 
cacggctggc 
cttttgagct 
aggagacatc 
agagggaggc 
ccatcgagat 
cctcccccag 
acctccggaa 
gggaggagct 
gccggcctgt 
actcggcgga 
gggaagacgt 
agttcatcga 
cctacctgga 
aggcctttga 
aggaggagac 
ccctcccaga 
agggacccca 
aggaccccaa 
gtgaccagga 
tgacctgcac 
tctccaagga 
agtttggggc 
gggaccagct 
cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ctcctttgat
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 



gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
ggatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 
tcatttcctt 
aaacagttgg 



aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 
ctcccccaac 
aaccacaaaa 



ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 



6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6911 



<210> 19 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 19 

tcgaccgccc agccaggtgc aaaatgccgt 
agattacagc tcgacggagc tcgggaaggg 
tgttctcgga acgccggctg acaagcgggg 
gcccactgga gcagccgggg gtggcccgtt 
agccagagat tcgagccggc ctcgcccagc 
ggcgcctcgg ccctcccgac ctttccgagc 
acacgcgcca agcatgctga gggtcttcat 
caccgacatc agcgatgcct actgctccgc 
agtcatcaag aacagcgtga accctgtatg 
catccccctg gaccagggct ctgagcttca 
gaggaacagg ttcctggggg aagccaaggt 
tctgtccgcc agcttcaatg cccccctgct 
gctggtcctg caggtgtcct acacaccgct 
tactcctctg gagccctccc cgactctgcc 
agaggaagac acagaggacc agggactcac 
aagcggaggc ccgggggctc ccaccacccc 
ctaccccggg atcaaaagaa agcgaagtgc 
accgcaggat ttccagatca gggtccaggt 
catcaagcct gtggtcaagg ttaccgctgc 
gggaaacagc ccactcttca atgagactct 
gctgtttgat gagcccatct ttatcacggt 
tctcctcggg gagttccgga tggacgtggg 
tctcaggaag tggctgctgc tctcagaccc 
cctgaaaaca agcctttgtg tgctggggcc 
cccctctgaa gacaaggagg acattgaaag 
gcgaggagcc cacttctgcc tgaaggtctt 
tgccgtgatg gacaacgtga aacagatctt 
ggaccccttt gtggaggtca gctttgcggg 
gacggccaac cctcagtgga accagaacat 
cgaaaaaatg aggattcgta tcatagactg 
taccacctac ctgagtatgt cgaaaatctc 
tgcaggtgct gtcaagcctt cgaaagcctc 
cacttttggg ccctgctaca tcaacctcta 
agacccctac acagagctca acacaggcaa 
tctgctctcc ctggagacca agctggtgga 
tgcggatgac atcctccggg tggagaagta 
ggccttctac tcagccacca tgctgcagga 
catcgggaac tacgggaaca agttcgacat 
gtacagccgt gcagtctttg acgggtgcca 
acctgtggtg gtgctgtcat cctactggga 
ccagctgctt gggattgctg accggctgga 
gaaggcgcag tgctccacgg aggacgtgga 
catcgcaggc tgcagccagc ctctgggtga 
ggaccagtac ctgtaccagc tgcgcaccca 
ggccctgaag ctcggccaca gtgagctccc 
cctgcgtctg cgtgccctgg cagaggagcc 



gtcattggga 
cggcgggggt 
tgagcgcagg 
cccctttaag 
cagccctctc 
cctctttgcg 
cctctatgcc 
ggtgtttgca 
gaatgaggga 
tgtggtggtc 
cccactccga 
ggacaccaag 
gcctggagct 
tgacctggat 
tggagatgag 
aaggaaacta 
gcctacatct 
gatcgagggg 
agggcagacc 
tttcttcaac 
ggtagactct 
caccatttac 
tgatgacttc 
tggggacgaa 
caacctgctc 
ccgggccgag 
tggcttcgag 
gaaaatgctg 
cacactgcct 
ggaccgcctg 
tgcccctgga 
agacttggat 
tggcagtccc 

gggggaaggt 
gcacagtgaa 
ccttaggagg 
tgtggatgat 
gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 



gactccgcag 

ggaagatgag 

cggggcgggg 

agcaactgct 

cagcgagggg 

ccctgggcgc 

gagaacgtcc 

ggggtgaaga 

tttgaatggg 

aaagaccatg 

gaggtcctcg 

aagcagccca 

gtgcccctgt 

gtagtggcag 

gcggagccat 

ccttcacgtc 

agaaagctgc 

cgccagctgc 

aagcggacgc 

ttgtttgact 

cgttctctca 

agagagcccc 

tctgctgggg 

gcgcctctgg 

cggcccacag 

gacttgccgc 

agtaacaaga 

tgcagcaaga 

gccatgtttc 

actcacaatg 

ggagaaatag 

gactacctgg 

agagagttca 

gtggcttatc 

cagaaggtgg 

cgcaagtact 

gccatccagt 

ccgctggcct 

ctaccctggg 

catagaatcg 

gagcaggtcc 

gctcagctga 

acaccctctg 

caaatcactg 

gagcaggcgg 

ctgccggaca 



ccggagcatt 
cagaagcccc 
acccagccta 
ctaagccagg 
acccacaagc 
acggggccct 
acacacccga 
agagaaccaa 
acctcaaggg 
agacgatggg 
ccacccctag 
caggggcctc 
tcccgccccc 
acacaggagg 
tcctggatca 
ctccgcccca 
tgtcagacaa 
cgggggtgaa 
ggatccacaa 
ctcctgggga 
ggacagatgc 
ggcacgccta 
ccagaggcta 
agagaaaaga 
gcgtagccct 
agatggacga 
agaacttggt 
tcttggagaa 
cctccatgtg 
acatcgtggc 
aagaggagcc 
gcttcctccc 
caggcttccc 
gtggccggct 
aggaccttcc 
ccctgtttgc 
ttgaggtcag 
ccaccactca 
gtaacgtgaa 
agactcagaa 
acctggccct 
cggatgagct 
ccacccacct 
aggctgccct 
aggactggct 
tcgtcatctg 
gatgctgcag
ctcccggcgg 
gaaatatccg 
gtggtttggg 
tgtctttgct 
aacgggcctc 
cagcttccgc 
gactctgctc 
ccagacccgg 
cggggagaag 
tgaggaatgg 
caccatcccc 
acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 
ctcctttgat
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 



ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 
gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggtgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 



gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 



ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca
tgtcgatgag 
ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 
ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
cttcaactgg 
tgccaagaag 
ggtgttccag 
gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
ggatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 



cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 
agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 
aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 



aagtcctctt 
caatctttct 
gggtcaagct 
ggaagctgtc 
actggggcac 
tacccaagga 
gtccggagaa 
tgtttgagaa 
ccgatgtgaa 
agtgggaaga 
agtatagcat 
tgtactacac 
tggaagcact 
ctctttttgg 
gctggcgccg 
agggggccct 
tgagcttcgg 
atctacgctg 
ctgatcccta 
acacccttaa 
agccggccac 
cttatggtgc 
cacggctggc 
cttttgagct 
aggagacatc 
agagggaggc 
ccatcgagat 
cctcccccag 
acctccggaa 
gggaggagct 
gccggcctgt 
actcggcgga 
gggaagacgt 
agttcatcga
cctacctgga 
aggcctttga 
aggaggagac 
ccctcccaga 
agggacccca 
aggaccccaa 
gtgaccagga 
tgacctgcac 
tctccaagga 
agtttggggc 
gggaccagct 
cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 



2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
acatggagct ctgagatcac cccacttcca tcatttcctt ctcccccaac ccaacgcttt 6840
tttggatcag ctcagacata tttcagtata aaacagttgg aaccacaaaa aaaaaaaaaa 6900
aaaaaaaaaa a 6911

<210> 20

<211> 6911

<212> DNA

<213> Homo sapiens

<400> 20



tcgaccgccc 

agattacagc 

tgttctcgga 

gcccactgga 

agccagagat 

ggcgcctcgg 

acacgcgcca 

caccgacatc 

agtcatcaag 

catccccctg 

gaggaacagg 

tctgtccgcc 

gctggtcctg 

tactcctctg 

agaggaagac 

aagcggaggc 

ctaccccggg 

accgcaggat 

catcaagcct 

gggaaacagc 

gctgtttgat 

tctcctcggg 

tctcaggaag

cctgaaaaca 

cccctctgaa 

gcgaggagcc 

tgccgtgatg 

ggaccccttt 

gacggccaac 

cgaaaaaatg 

taccacctac 

tgcaggtgct 

cacttttggg 

agacccctac 

tctgctctcc 

tgcggatgac 

ggccttctac 

catcgggaac 

gtacagccgt 

acctgtggtg 

ccagctgctt 

gaaggcgcag 

catcgcaggc 

ggaccagtac 

ggccctgaag 

cctgcgtctg 

gatgctgcag 

ctcccggcgg 

gaaatatccg 

gtggtttggg 

tgtctttgct 

aacgggcctc 

cagcttccgc 

gactctgctc 

ccagacccgg 

cggggagaag 

tgaggaatgg 



agccaggtgc 

tcgacggagc 

acgccggctg 

gcagccgggg 

tcgagccggc 

ccctcccgac 

agcatgctga 

agcgatgcct 

aacagcgtga 

gaccagggct 

ttcctggggg 

agcttcaatg 

caggtgtcct 

gagccctccc 

acagaggacc 

ccgggggctc 

atcaaaagaa 

ttccagatca 

gtggtcaagg 

ccactcttca 

gagcccatct 

gagttccgga 

tggctgctgc 

agcctttgtg 

gacaaggagg 

cacttctgcc 

gacaacgtga 

gtggaggtca 

cctcagtgga 

aggattcgta 

ctgagtatgt 

gtcaagcctt 

ccctgctaca 

acagagctca 

ctggagacca 

atcctccggg 

tcagccacca 

tacgggaaca 

gcagtctttg 

gtgctgtcat 

gggattgctg 

tgctccacgg 

tgcagccagc 

ctgtaccagc 

ctcggccaca 

cgtgccctgg 

ggagacaagc 

ggtgccaact 

atggagaagg 

ctctctgtgg 

gaaacctatg 

acctacccca 

ccctcggccg 

catgacatgg 

cttcccggag 

gtgcttccca 

tccacagacc 



aaaatgccgt 

tcgggaaggg 

acaagcgggg 

gtggcccgtt 

ctcgcccagc 

ctttccgagc 

gggtcttcat 

actgctccgc 

accctgtatg 

ctgagcttca 

aagccaaggt 

cccccctgct 

acacaccgct 

cgactctgcc 

agggactcac 

ccaccacccc 

agcgaagtgc 

gggtccaggt 

ttaccgctgc 

atgagactct 

ttatcacggt 

tggacgtggg 

tctcagaccc 

tgctggggcc 

acattgaaag 

tgaaggtctt 

aacagatctt 

gctttgcggg 

accagaacat 

tcatagactg 

cgaaaatctc 

cgaaagcctc 

tcaacctcta 

acacaggcaa 

agctggtgga 

tggagaagta 

tgctgcagga 

agttcgacat 

acgggtgcca 

cctactggga 

accggctgga 

aggacgtgga 

ctctgggtga 

tgcgcaccca 

gtgagctccc 

cagaggagcc 

gtgtggcata 

actgtggcaa 

tgcctggcgc 

atgagaagga 

agaacgagac 

agttttctga 

gctggacctg 

acgccggtca 

gccagtggat 

aggatgacat 

tcaaccgggc 



gtcattggga 

cggcgggggt 

tgagcgcagg 

cccctttaag 

cagccctctc 

cctctttgcg 

cctctatgcc 

ggtgtttgca 

gaatgaggga 

tgtggtggtc 

cccactccga 

ggacaccaag 

gcctggagct 

tgacctggat 

tggagatgag 

aaggaaacta 

gcctacatct 

gatcgagggg 

agggcagacc 

tttcttcaac 

ggtagactct 

caccatttac 

tgatgacttc 

tggggacgaa 

caacctgctc 

ccgggccgag 

tggcttcgag 

gaaaatgctg 

cacactgcct 

ggaccgcctg 

tgcccctgga 

agacttggat 

tggcagtccc 

gggggaaggt 

gcacagtgaa 

ccttaggagg 

tgtggatgat 

gacctgcctg 

ctactactac 

ggacatcagc 

agctggcctg 

ctcgctggtg 

catccatgag 

tcacctgagc 

tgcagctctg 

ccagaacagc 

ccagcgggtg 

gaattgtggg 

ccggatgcca 

gttcaaccag 

taagttggcc 

cgtcacgggc 

ggctggagat 

cctgagcttc 

ctacatgagt 

tgagtgccca 

tgtcgatgag 



gactccgcag 
ggaagatgag 
cggggcgggg 
agcaactgct 
cagcgagggg 
ccctgggcgc 
gagaacgtcc 
ggggtgaaga 
tttgaatggg 
aaagaccatg 
gaggtcctcg 
aagcagccca 
gtgcccctgt 
gtagtggcag 
gcggagccat 
ccttcacgtc 
agaaagctgc 
cgccagctgc 
aagcggacgc 
ttgtttgact 
cgttctctca 
agagagcccc 
tctgctgggg 
gcgcctctgg 
cggcccacag 
gacttgccgc 
agtaacaaga 
tgcagcaaga 
gccatgtttc 
actcacaatg 
ggagaaatag 
gactacctgg 
agagagttca 
gtggcttatc 
cagaaggtgg 
cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 



ccggagcatt 
cagaagcccc 
acccagccta 
ctaagccagg 
acccacaagc 
acggggccct 
acacacccga 
agagaaccaa 
acctcaaggg 
agacgatggg 
ccacccctag 
caggggcctc 
tcccgccccc 
acacaggagg 
tcctggatca 
ctccgcccca 
tgtcagacaa 
cgggggtgaa 
ggatccacaa 
ctcctgggga 
ggacagatgc 
ggcacgccta 
ccagaggcta 
agagaaaaga 
gcgtagccct 
agatggacga 
agaacttggt 
tcttggagaa 
cctccatgtg 
acatcgtggc 
aagaggagcc 
gcttcctccc 
caggcttccc 
gtggccggct 
aggaccttcc 
ccctgtttgc 
ttgaggtcag 
ccaccactca 
gtaacgtgaa 
agactcagaa 
acctggccct 
cggatgagct 
ccacccacct 
aggctgccct 
aggactggct 
tcgtcatctg 
aagtcctctt 
caatctttct 
gggtcaagct 
ggaagctgtc 
actggggcac 
tacccaagga 
gtccggagaa 
tgtttgagaa 
ccgatgtgaa 
agtgggaaga 
agtatagcat 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
caccatcccc
acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 

ggtgggccag
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcgttatcgt 
ctacctgcca
caagactgag 
ctcctttgat 
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 
gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 



ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 
ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
cttcaactgg 
tgccaagaag 
ggtgttccag 
gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
ggatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 
tcatttcctt 
aaacagttgg 



gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 
agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 
aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 
ctcccccaac 
aaccacaaaa 



tgtactacac 
tggaagcact 
ctctttttgg 
gctggcgccg 
agggggccct 
tgagcttcgg 
atctacgctg 
ctgatcccta 
acacccttaa 
agccggccac 
cttatggtgc 
cacggctggc 
cttttgagct 
aggagacatc 
agagggaggc 
ccatcgagat 
cctcccccag 
acctccggaa 
gggaggagct 
gccggcctgt 
actcggcgga 
gggaagacgt 
agttcatcga 
cctacctgga 
aggcctttga 
aggaggagac 
ccctcccaga 
agggacccca 
aggaccccaa 
gtgaccagga 
tgacctgcac 
tctccaagga 
agtttggggc 
gggaccagct 
cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 
3540

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6911 



<210> 21 

<211> 6909 

<212> DNA 

<213> Homo sapiens 

<400> 21

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480 

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540 

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600 

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660 

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720 

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780 

tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840 

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900 

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960 

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020 

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080 

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140 

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200 

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260 

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320 

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380 

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440 

cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500 

gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560 

tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620 

ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680 

gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740 

cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800 

taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860 

tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920 

cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980 

agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040 

tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100 

tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160 

ggccttctac tcagccacca tgctgcagga tgtggatgat gccatccagt ttgaggtcag 2220 

catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280 

gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340 

acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400 

ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460 

gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520 

catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580 

ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640 

ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700 

cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760 

gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820 

ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880 

gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940 

gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000 

tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060 

aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120 

cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180 

gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240 

ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300 

cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360 

tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420 

caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480 

acaccgacgg cggcgctggg tgcgcctgcg caggagggat ctcagccaaa tggaagcact 3540 

gaaaaggcac aggcaggcgg aggcggaggg cgagggctgg gagtacgcct ctctttttgg 3600 

ctggaagttc cacctcgagt accgcaagac agatgccttc cgccgccgcc gctggcgccg 3660 

tcgcatggag ccactggaga agacggggcc tgcagctgtg tttgcccttg agggggccct 3720 

gggcggcgtg atggatgaca agagtgaaga ttccatgtcc gtctccacct tgagcttcgg 3780 

tgtgaacaga cccacgattt cctgcatatt cgactatggg aaccgctacc atctacgctg 3840 

ctacatgtac caggcccggg acctggctgc gatggacaag gactcttttt ctgatcccta 3900 

tgccatcgtc tccttcctgc accagagcca gaagacggtg gtggtgaaga acacccttaa 3960 

ccccacctgg gaccagacgc tcatcttcta cgagatcgag atctttggcg agccggccac 4020 

agttgctgag caaccgccca gcattgtggt ggagctgtac gaccatgaca cttatggtgc 4080 

agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140 

ctggttccca ctgacgaggg gcagccagcc gtcgggggag ctgctggcct cttttgagct 4200 

catccagaga gagaagccgg ccatccacca tattcctggt tttgaggtgc aggagacatc 4260 

aaggatcctg gatgagtctg aggacacaga cctgccctac ccaccacccc agagggaggc 4320 

caacatctac atggttcctc agaacatcaa gccagcgctc cagcgtaccg ccatcgagat 4380 

cctggcatgg ggcctgcgga acatgaagag ttaccagctg gccaacatct cctcccccag 4440 

cctcgtggta gagtgtgggg gccagacggt gcagtcctgt gtcatcagga acctccggaa 4500 

gaaccccaac tttgacatct gcaccctctt catggaagtg atgctgccca gggaggagct 4560 

ctactgcccc cccatcaccg tcaaggtcat cgataaccgc cagtttggcc gccggcctgt 4620 

ggtgggccag tgtaccatcc gctccctgga gagcttcctg tgtgacccct actcggcgga 4680 

gagtccatcc ccacagggtg gcccagacga tgtgagccta ctcagtcctg gggaagacgt 4740 

gctcatcgac attgatgaca aggagcccct catccccatc caggaggaag agttcatcga 4800 

ttggtggagc aaattctttg cctccatagg ggagagggaa aagtgcggct cctacctgga 4860 

gaaggatttt gacaccctga aggtctatga cacacagctg gagaatgtgg aggcctttga 4920 

gggcctgtct gacttttgta acaccttcaa gctgtaccgg ggcaagacgc aggaggagac 4980 

agaagatcca tctgtgattg gtgaatttaa gggcctcttc aaaatttatc ccctcccaga 5040 

agacccagcc atccccatgc ccccaagaca gttccaccag ctggccgccc agggacccca 5100 

ggagtgcttg gtccgtatct acattgtccg agcatttggc ctgcagccca aggaccccaa 5160 

tggaaagtgt gatccttaca tcaagatctc catagggaag aaatcagtga gtgaccagga 5220 

taactacatc ccctgcacgc tggagcccgt atttggaaag atgttcgagc tgacctgcac 5280 

tctgcctctg gagaaggacc taaagatcac tctctatgac tatgacctcc tctccaagga 5340 

cgaaaagatc ggtgagacgg tcgtcgacct ggagaacagg ctgctgtcca agtttggggc 5400 

tcgctgtgga ctcccacaga cctactgtgt ctctggaccg aaccagtggc gggaccagct 5460 

ccgcccctcc cagctcctcc acctcttctg ccagcagcat agagtcaagg cacctgtgta 5520 

ccggacagac cgtgtaatgt ttcaggataa agaatattcc attgaagaga tagaggctgg 5580 

caggatccca aacccacacc tgggcccagt ggaggagcgt ctggctctgc atgtgcttca 5640 

gcagcagggc ctggtcccgg agcacgtgga gtcacggccc ctctacagcc ccctgcagcc 5700 

agacatcgag caggggaagc tgcagatgtg ggtcgaccta tttccgaagg ccctggggcg 5760 

gcctggacct cccttcaaca tcaccccacg gagagccaga aggtttttcc tgcgttgtat 5820 

tatctggaat accagagatg tgatcctgga tgacctgagc ctcacggggg agaagatgag 5880 

cgacatttat gtgaaaggtt ggatgattgg ctttgaagaa cacaagcaaa agacagacgt 5940 

gcattatcgt tccctgggag gtgaaggcaa cttcaactgg aggttcattt tccccttcga 6000 

ctacctgcca gctgagcaag tctgtaccat tgccaagaag gatgccttct ggaggctgga 6060 

caagactgag caaaatccca gcacgagtgg tgttccagat ctgggacaat gacaagttct 6120 

cctttgatga ttttctgggc tccctgcagc tcgatctcaa ccgcatgccc aagccagcca 6180 

agacagccaa gaagtgctcc ttggaccagc tggatgatgc tttccaccca gaatggtttg 6240 

tgtccctttt tgagcagaaa acagtgaagg gctggtggcc ctgtgtagca gaagagggtg 6300 

agaagaaaat actggcgggc aagctggaaa tgaccttgga gattgtagca gagagtgagc 6360 

atgaggagcg gcctgctggc cagggccggg atgagcccaa catgaaccct aagcttgagg 6420 

acccaaggcg ccccgacacc tccttcctgt ggtttacctc cccatacaag accatgaagt 6480 

tcatcctgtg gcggcgtttc cggtgggcca tcatcctctt catcatcctc ttcatcctgc 6540 

tgctgttcct ggccatcttc atctacgcct tcccgaacta tgctgccatg aagctggtga 6600 

agcccttcag ctgaggactc tcctgccctg tagaaggggc cgtggggtcc cctccagcat 6660 

gggactggcc tgcctcctcc gcccagctcg gcgagctcct ccagacctcc taggcctgat 6720 

tgtcctgcca gggtgggcag acagacagat ggaccggccc acactcccag agttgctaac 6780 

atggagctct gagatcaccc cacttccatc atttccttct cccccaaccc aacgcttttt 6840 

tggatcagct cagacatatt tcagtataaa acagttggaa ccacaaaaaa aaaaaaaaaa 6900 

aaaaaaaaa 6909 



<210> 22 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 22 
tgggacctca agggcatccc 20 

<210> 23 
<211> 20 
<212> DNA 

<213> Homo sapiens 

<400> 23 
accatgctgc aggatgtgga 20 

<210> 24 
<211> 20 
<212> DNA 

<213> Homo sapiens 

<400> 24 

gggaggtgaa ggcaacttca 20 

<210> 25 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 25 

ctcacggggg agaagatgag 20 

<210> 26 
<211> 20 
<212> DNA 

<213> Homo sapiens 
<400> 26 

ctgtggcggc gtttccggtg 20 

<210> 27 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 27 

acatcaagga tcctggatga 20 

<210> 28 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 28 

ctgtggcggc gtttccggtg 20 

<210> 29 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 29 

acagacgtgc attatcgttc 20 

<210> 30 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 30 

aagactgaga gcaaaatccc 20 

<210> 31 

<211> 507 

<212> DNA 

<213> Homo sapiens 

<400> 31 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 
acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 
caccgacatc agcgatgcct actgctccgc ggtgtttgca ggtaggaggg gccgaccacc 480 
ctcgccgggg tcggggtggg gtagagg 507 

<210> 32 

<211> 183 

<212> DNA 

<213> Homo sapiens 

<400> 32 

aaaggcggga tgtgtctctc cattctccct tttgtgtctc ttgtaggggt gaagaagaga 60 
accaaagtca tcaagaacag cgtgaaccct gtatggaatg aggtatgtga gtttttctcc 120 
ttccttttct ctctgtctgc tgcagggggc ttgggaggag gtgccttctc agcagtgtcc 180 
ttg 183 

<210> 33 

<211> 264 

<212> DNA 

<213> Homo sapiens 

<400> 33 

cattcatgaa tgcctactca gtgccctggt ggcacgaagg tgaaccagac acagtctctt 60 
ctcctagagg gccataggtt aagatgcctt ttctcttttt cttccaggga tttgaatggg 120 
acctcaaggg catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg 180 
agacgatggg gaggaacagg taaggtggcc agaggggggt gctccatggc ttgaaggtgc 240 
aggtaggatt gtggagtata caga 264 

<210> 34 

<211> 223 

<212> DNA 

<213> Homo sapiens 

<400> 34 

cagaagagc^agggtgcctt aggctagttt tctacatttg acttctctct cctctcaggt 60 
tcctggggga agccaaggtc ccactccgag aggtcctcgc cacccctagt ctgtccgcca 120 
gcttcaatgc ccccctgctg gacaccaaga agcagcccac aggggtaagt gcccatcagc 180 
ctctgccagg ttaaggtcca aggcattgcc aggtggcttc ctc 223 

<210> 35 

<211> 224 

<212> DNA 

<213> Homo sapiens 

<400> 35 

cagtggtccg aggccagcgc accaacctgt cccccacgtc tcatctcttc caggcctcgc 60 
tggtcctgca ggtgtcctac acaccgctgc ctggagctgt gcccctgttc ccgcccccta 120 
ctcctctgga gccctccccg actctgcctg acctggatgt agtggcaggt gggtagccca 180 
cgttggcctg gctgggcccc agcaagaatg gccggcagtg gcac 224 

<210> 36 

<211> 315 

<212> DNA 

<213> Homo sapiens 

<400> 36 

aggggcaggg gcagggccag agggccaggc ctcattaggg ccctctcctc ttagacacag 60 
gaggagagga agacacagag gaccagggac tcactggaga tgaggcggag ccattcctgg 120 
atcaaagcgg aggcccgggg gctcccacca ccccaaggaa actaccttca cgtcctccgc 180 
cccactaccc cgggatcaaa agaaagcgaa gtgcgcctac atctagaaag ctgctgtcag 240 
acaaaccgca ggatttccag gtgatgaacg ggctttctct gaccccaggc tcctcttcag 300 
ccatcagctg cgggt 315 



<210> 37 

<211> 249 

<212> DNA 

<213> Homo sapiens 
<400> 37 

ccagtggtga gatggtccct gagatttctg actcttgggg tggatggtgg gtggtcctta 60 

actcttcccc cttctggctt tcagatcagg gtccaggtga tcgaggggcg ccagctgccg 120 

ggggtgaaca tcaagcctgt ggtcaaggtt accgctgcag ggcagaccaa gcggacgcgg 180 

atccacaagg gaaacagccc actcttcaat gaggtgggag acatggggca tgagggcaga 240 

accttgtgg 249 



<210> 38 

<211> 185 

<212> DNA 

<213> Homo sapiens 

<400> 38 

ccctggcctg agggatcagc aggcactgat atgtctctct ttgctctgaa ccaacagact 60 

cttttcttca acttgtttga ctctcctggg gagctgtttg atgagcccat ctttatcacg 120 

gtatgtctca gcagtcaaag tgttctccgt gggctgtatg tatgcacata ggtgtcagtg 180 

cacac 185 

<210> 39 

<211> 196 

<212> DNA 

<213> Homo sapiens 

<400> 39 

aagagctatt gggttggccg tgtgggccac atgtccctgt gaatgtgagc catgatcttt 60 

ctctgcaggt ggtagactct cgttctctca ggacagatgc tctcctcggg gagttccggg 120 

taattgctta ttttctaaaa gcagtcagtt ctcacttctc cgtgttggtg gagcctctgt 180 

ggaccatggg cagggg 196 

<210> 40 

<211> 178 

<212> DNA 

<213> Homo sapiens 

<400> 40 

tggaatcgta taatgcacca cactttattt aacgctttgg cggcaagagt ttgatttgtg 60 

tctcctctct tgattgcaga tggacgtggg caccatttac agagagcccc gtgagttctc 120 

accactttgg ccgtatcctt gcattttggt tctggaggct gattggggac actcattt 178 

<210> 41 

<211> 231 

<212> DNA 

<213> Homo sapiens 

<400> 41 

ggggtcttct gattctggga tcaccaaagg atgttgtctc tcttagggca cgcctatctc 60 

aggaagtggc tgctgctctc agaccctgat gacttctctg ctggggccag aggctacctg 120 

aaaacaagcc tttgtgtgct ggggcctggg gacgaagcgc ctgtgagtac atttccctgg 180 

gtcttcctta cggtccccca cgcggcactt ggttgcggag gcaccaaacc a 231 

<210> 42 

<211> 247 

<212> DNA 

<213> Homo sapiens 

<400> 42 

gtcaaaaccc tgtgctcagg agcgcatgaa ggaacgtatt tggttttctt tgtagctgga 60 

gagaaaagac ccctctgaag acaaggagga cattgaaagc aacctgctcc ggcccacagg 120 

cgtagccctg cgaggagccc acttctgcct gaaggtcttc cgggccgagg acttgccgca 180 

gagtgcgtgg ggcgcgccct tgggtgggag gtctgcagga ggctggaggc gcagggctgg 240 

tgggggt 247 

<210> 43 

<211> 179 

<212> DNA 

<213> Homo sapiens 
<400> 43 

caggcagtga ctggtgtgtc cctcttccca gtggacgatg ccgtgatgga caacgtgaaa 60 
cagatctttg gcttcgagag taacaagaag aacttggtgg acccctttgt ggaggtcagc 120 
tttgcgggga aaatggtaag gagcaaggga gcaggagggt tctctcggga ggggacggg 179 

<210> 44 

<211> 202 

<212> DNA 

<213> Homo sapiens 

<400> 44 

ccccggggga gcccagagtc cccatggagc tgatcaactt gtcccctccc tgtgtcttct 60 
agctgtgcag caagatcttg gagaagacgg ccaaccctca gtggaaccag aacatcacac 120 
tgcctgccat ggtgagcctc ctgtccccag caaacccaag gaggcccctg gggctctggg 180 
cttcgggagg tccagggctc ct 202 

<210> 45 

<211> 167 

<212> DNA 

<213> Homo sapiens 

<400> 45 

gggaggggct gttctatctt caaaaggact cttctcccaa cacgcctcta ttccttcctc 60 
agtttccctc catgtgcgaa aaaatgagga ttcgtatcat agactggtga gttctgagtc 120 
ttggagtctt tagggcgggc tgtcctgagg gggcgctccc tcagttt 167 

<210> 46 

<211> 220 

<212> DNA 

<213> Homo sapiens 

<400> 46 

tgtggcctga gttcctttcc tgtgtcaggc cctctctgct cccttgctct ctagggaccg 60 
cctgactcac aatgacatcg tggctaccac ctacctgagt atgtcgaaaa tctctgcccc 120 
tggaggagaa atagaaggta tgttccctct tcgttctgcc ctttgacccc ctgtgctctc 180 
cccccctcta tccagcttac acttctagtt ttgagagttt 220 



<210> 47 
<211> 172 
<212> DNA 

<213> Homo sapiens 
<400> 47 

acagcctgtt catgtaaccc gtccttctcc cagccatgcc caccctaacc ccttttccat 60 
ttctttacgc ttcagaggag cctgcaggtg ctgtcaagcc ttcgaaagcc tcagactgta 120 
cgttgctgtc accttgggga caaccagggg agtggggcct tgggttttgg ct 172 

<210> 48 
<211> 200 
<212> DNA 

<213> Homo sapiens 
<400> 48 

ccgacccctc tgattgccac ttgtgtctcc cagtggatga ctacctgggc ttcctcccca 60 
cttttgggcc ctgctacatc aacctctatg gcagtcccag agagttcaca ggcttcccag 120 
acccctacac agagctcaac acaggcaagg taagccggct ggagccctgg caagggcagg 180 
atgccacatg cccaggtggg 200 

<210> 49 
<211> 217 
<212> DNA 

<213> Homo sapiens 
<400> 49 

cctcccctct gtctcccctg ctccttgtga cctgacctcc ctggcagggg gaaggtgtgg 60 
cttatcgtgg ccggcttctg ctctccctgg agaccaagct ggtggagcac agtgaacaga 120 
aggtggagga ccttcctgcg gatgacatcc tccgggtgga ggtgaggggt gtggctctgg 180 
gtgggagctg ggcgtcgggg cagggaaggg atggcca 217 

<210> 50 

<211> 269 

<212> DNA 

<213> Homo sapiens 

<400> 50 

agcctgggtg cctttctttg ctcctcccgt gaccctctgg tctactctct gctctcagaa 60 

gtaccttagg aggcgcaagt actccctgtt tgcggccttc tactcagcca ccatgctgca 120 

ggatgtggat gatgccatcc agtttgaggt cagcatcggg aactacggga acaagttcga 180 

catgacctgc ctgccgctgg cctccaccac tcagtacagc cgtgcagtct ttgacggtga 240 

ggcagtgctc ctggctggga ccccgatca 269 

<210> 51 

<211> 225 

<212> DNA 

<213> Homo sapiens 

<400> 51 

actcctggca cagcgctcag gcccgtctct ccattccagg gtgccactac tactacctac 60 

cctggggtaa cgtgaaacct gtggtggtgc tgtcatccta ctgggaggac atcagccata 120 

gaatcgagac tcagaaccag ctgcttggga ttgctgaccg gctggtgagt gaaaacttgc 180 

ccaaagctgc acatgcctat gcatgcacct gctacccccg ctgca 225 

<210> 52 

<211> 227 

<212> DNA 

<213> Homo sapiens 

<400> 52 

gggtccagca tgcaccctct gccctgtggt gacacacctg acccttgcct gcccattcca 60 

caggaagctg gcctggagca ggtccacctg gccctgaagg cgcagtgctc cacggaggac 120 

gtggactcgc tggtggctca gctgacggat gagctcatcg caggctgcag gtagggggga 180 

cctggcgccc ctggtgccca cctctcctgg ctcaactggg cctgttt 227 

<210> 53 

<211> 303 

<212> DNA 

<213> Homo sapiens 

<400> 53 

tgggagaccc tgggctcatc aggcgcattc catctgtccg tccctcacag ccagcctctg 60 

ggtgacatcc atgagacacc ctctgccacc cacctggacc agtacctgta ccagctgcgc 120 

acccatcacc tgagccaaat cactgaggct gccctggccc tgaagctcgg ccacagtgag 180 

ctccctgcag ctctggagca ggcggaggac tggctcctgc gtctgcgtgc cctggcagag 240 

gaggtaatta agcctggggg tgcctttctt cttctgctct cctgctgcct ggaacatcag 300 

aac 303 

<210> 54 

<211> 272 

<212> DNA 

<213> Homo sapiens 

<400> 54 

cgtgggcctg gtgtgtcacc atccccaccc cgaccaccac cctctgttca gccccagaac 60 

agcctgccgg acatcgtcat ctggatgctg cagggagaca agcgtgtggc ataccagcgg 120 

gtgcccgccc accaagtcct cttctcccgg cggggtgcca actactgtgg caagaattgt 180 

gggaagctac agacaatctt tctgaaagtg agttttcttt ttccaagtca tgatcgtatt 240 

tccaacataa ggcctttctc ccatctcttg ct 272 



<210> 55 

<211> 219 

<212> DNA 

<213> Homo sapiens 
<400> 55 

tgtgggtttc tgtccttctt cggtacccag tatccgatgg agaaggtgcc tggcgcccgg 60 
atgccagtgc agatacgggt caagctgtgg tttgggctct ctgtggatga gaaggagttc 120 
aaccagtttg ctgaggggaa gctgtctgtc tttgctgaaa ccgtgagtac ctgccagccc 180 
ccacctctgc ctcccactac ctggagctgc cttggcccc 219 

<210> 56 

<211> 292 

<212> DNA 

<213> Homo sapiens 

<400> 56 

tgcctcccac tacctggagc tgccttggcc cccttcacgc ctcattcttc ctggccctcc 60 
agtatgagaa cgagactaag ttggcccttg ttgggaactg gggcacaacg ggcctcacct 120 
accccaagtt ttctgacgtc acgggcaaga tcaagctacc caaggacagc ttccgcccct 180 
cggccggctg gacctgggct ggagattggt tcgtgtgtcc ggagaagacg tgagtcgtgg 240 
gcagggaggg ctggggagag ccaggccagg ctgcccacca tggactgcac cc 292 

<210> 57 

<211> 242 

<212> DNA 

<213> Homo sapiens 

<400> 57 

tggatggggg cctctccagc agagcagcag agactctgac cagccctcct ccacagtctg 60 
ctccatgaca tggacgccgg tcacctgagc ttcgtggaag aggtgtttga gaaccagacc 120 
cagcttcccg gaggccagtg gatctacatg agtgacaact acaccgatgt ggtaaagcag 180 
gcactcaggg gcaggtgggg tctagacatt tggtctctgg aggcacctgg tgctcaggga 240 
ca 242 



<210> 58 

<211> 215 

<212> DNA 

<213> Homo sapiens 

<400> 58 

tcacatctgt ctgtctcctc tcattgcttg cctgttcggt tttgtcctta gaacggggag 60 
aaggtgcttc ccaaggatga cattgagtgc ccactgggct ggaagtggga agatgaggaa 120 
tggtccacag acctcaaccg ggctgtcgat gagcaaggtg ggcagcatgt ggaacctggc 180 
gagccccatc cccggcaagc tctcaagcca tgcat 215 

<210> 59 

<211> 246 

<212> DNA 

<213> Homo sapiens 

<400> 59 

agagatggtc ccaggagaga tggggggaag tgccaagcaa tgagtgaccg gttccccctc 60 
ccccaggctg ggagtatagc atcaccatcc ccccggagcg gaagccgaag cactgggtcc 120 
ctgctgagaa gatgtactac acacaccgac ggcggcgctg ggtgcgcctg cgcaggaggg 180 
atctcagcca aatggaagca ctgaaaaagg gtgagccagc aggtggtggg tgggagtgag 240 
gcctgt 246 

<210> 60 

<211> 253 

<212> DNA 

<213> Homo sapiens 

<400> 60 

cttcccaccg gcctctgagt ctgccccttc ttgtgcagca caggcaggcg gaggcggagg 60 
gcgagggctg ggagtacgcc tctctttttg gctggaagtt ccacctcgag taccgcaaga 120 
cagatgcctt ccgccgccgc cgctggcgcc gtcgcatgga gccactggag aagacggggc 180 
ctgcagctgt gtttgccctt gagggggccc tggtatgtgg ggctgcactt gtcctggctt 240 
gggtagggta tat 253 

<210> 61 
<211> 177 
<212> DNA 

<213> Homo sapiens 



<400> 61 

gaatctgcca taaccagctt cgtgtctcca gggcggcgtg atggatgaca agagtgaaga 60 

ttccatgtcc gtctccacct tgagcttcgg tgtgaacaga cccacgattt cctgcatatt 120 

cgactgtaag taggcttcga ggcctctatg gggtgataag ggtgtgtcac cttatgc 177 

<210> 62 

<211> 181 

<212> DNA 

<213> Homo sapiens 

<400> 62 

aaccactcca gccactcact ctggcacctc tgttttttcc cttggtgaag atgggaaccg 60 

ctaccatcta cgctgctaca tgtaccaggc ccgggacctg gctgcgatgg acaaggactc 120 

tttttctggt aggtgggaga gaggcaggag agtcagagac tgtgggctga gatctgggaa 180 

t 181 

<210> 63 

<211> 319 

<212> DNA 

<213> Homo sapiens 

<400> 63 

ccccacatgg ctctggagaa gacatctctc agggtccctg ctgtgtaatg tctcccctcc 60 

ccctctggcc atgcagatcc ctatgccatc gtctccttcc tgcaccagag ccagaagacg 120 

gtggtggtga agaacaccct taaccccacc tgggaccaga cgctcatctt ctacgagatc 180 

gagatctttg gcgagccggc cacagttgct gagcaaccgc ccagcattgt ggtggagctg 240 

tacgaccatg acacttatgt gagtctgccc agctcctgcc tcgtcccctc acagggaggg 300 

accatgtgca aaggtgggg 319 

<210> 64 

<211> 249 

<212> DNA 

<213> Homo sapiens 

<400> 64 

gccctgggta agggatgctg attcttgtct ctctacgctt ggtctagggt gcagacgagt 60 

ttatgggtcg ctgcatctgt caaccgagtc tggaacggat gccacggctg gcctggttcc 120 

cactgacgag gggcagccag ccgtcggggg agctgctggc ctcttttgag ctcatccaga 180 

gagagaaggt gaggctggtc tatatccaga tccaggaggc ccaggcagga gtggggtggg 240 

ggccaaccc 249 

<210> 65 

<211> 158 

<212> DNA 

<213> Homo sapiens 

<400> 65 

cactgacata gtccatgagt gtcatgaggg tgatgggggc cttaggtgac aagcacatga 60 

ccagagctct cttttcttca ctccagccgg ccatccacca tattcctggt tttgaggtaa 120 

gtcttgctct gacctttcct tcttcaaact gattgcca 158 

<210> 66 

<211> 132 

<212> DNA 

<213> Homo sapiens 

<400> 66 

ctttttcccc ttccaacccc tctcaccatc tcctgatgtg cacatcccat ggctgtgggc 60 

caggtgcagg agacatcaag gatcctggat gaggtgagct ggcggggccg aggtagaggg 120 

aaggtgaagc ca 132 

<210> 67 
<211> 216 
<212> DNA 
<213> Homo sapiens 

<400> 67 

tcttccttcc acctttgtct ccattctacc tgctgtccac tgcagtctga ggacacagac 60 

ctgccctacc caccacccca gagggaggcc aacatctaca tggttcctca gaacatcaag 120 

ccagcgctcc agcgtaccgc catcgaggtg agccgtccgg gcctgggcgt gggggctggg 180 

agcagcctgc ccttcccctt cctggcccca gccttt 216 

<210> 68 

<211> 263 

<212> DNA 

<213> Homo sapiens 

<400> 68 
cccgggcctt ctgagccact ctcctcattc tgtgtgctta gaatcctggc atggggcctg 60 
cggaacatga agagttacca gctggccaac atctcctccc ccagcctcgt ggtagagtgt 120 
gggggccaga cggtgcagtc ctgtgtcatc aggaacctcc ggaagaaccc caactttgac 180 
atctgcaccc tcttcatgga agtggtgagc cccacctccc tactgtcccc ttccagagtc 240 
ctggggctag aagttctaca tgt 263 

<210> 69 

<211> 249 

<212> DNA 

<213> Homo sapiens 

<400> 69 
caggccagtg cgttcttcct cctccaccca gatgctgccc agggaggagc tctactgccc 60 
ccccatcacc gtcaaggtca tcgataaccg ccagtttggc cgccggcctg tggtgggcca 120 
gtgtaccatc cgctccctgg agagcttcct gtgtgacccc tactcggcgg agagtccatc 180 
cccacagggt ggcccaggta ggggaagggg agatgatggg caggtcaggg aagggggagc 240 
ctagggcaa 249 

<210> 70 

<211> 180 

<212> DNA 

<213> Homo sapiens 



<400> 70 

aggggcgagc cttttgagag agcccctgtc aggcctggat ggctccctcc cctgcagacg 60 

atgtgagcct actcagtcct ggggaagacg tgctcatcga cattgatgac aaggagcccc 120 

tcatccccat ccaggtagga tgggcatcct ccagggaggc ctgggtcacc tttcccctcc 180 

<210> 71 

<211> 211 

<212> DNA 

<213> Homo sapiens 

<400> 71 

tgctgcttgg cgagtcctgt ttctgaaatg gtctctttct ttctacccac tcaggaggaa 60 
gagttcatcg attggtggag caaattcttt gcctccatag gggagaggga aaagtgcggc 120 
tcctacctgg agaaggattt tgacaccctg aaggtaaggc ctctcttcag tctgacagtc 180 
ggtgtgtgtg tgcgtgctgg gcagtgggag a 211 

<210> 72 

<211> 235 

<212> DNA 

<213> Homo sapiens 

<400> 72 

gttctacttt ctttctgtct cttgtcccct cctctaatcc ccatgtgtgg caggtctatg 60 

acacacagct ggagaatgtg gaggcctttg agggcctgtc tgacttttgt aacaccttca 120 

agctgtaccg gggcaagacg caggaggaga cagaagatcc atctgtgatt ggtgaattta 180 

aggtaaatcc tcgaagacgt ccctaaccca ggtgggccta agactgtggt gttgg 235 

<210> 73 
<211> 268 
<212> DNA 
<213> Homo sapiens 
<400> 73 

ggggacacag ccaaaccata tcaacaatga tgataaaata aaattaaccc ttccttcttt 60 

tcagggcctc ttcaaaattt atcccctccc agaagaccca gccatcccca tgcccccaag 120 

acagttccac cagctggccg cccagggacc ccaggagtgc ttggtccgta tctacattgt 180 

ccgagcattt ggcctgcagc ccaaggaccc caatggaaag gtaactttct agagccctca 240 

cctccccaga gtagcaggct caggtaca 268 

<210> 74 

<211> 200 

<212> DNA 

<213> Homo sapiens 

<400> 74 

tctggaaagt gttttcacag aagtgttttg tctcctcctc cagtgtgatc cttacatcaa 60 

gatctccata gggaagaaat cagtgagtga ccaggataac tacatcccct gcacgctgga 120 

gcccgtattt ggaaagtaaa ttggggcatc ttgggtcttg gggtggagga gccagacagg 180 

ataacccaca gtctagtggg 200 

<210> 75 

<211> 263 

<212> DNA 

<213> Homo sapiens 

<400> 75 

cctgttccct tgggtgccct gtgttggctg acattcggga atctgcccct tcctgcagga 60 

tgttcgagct gacctgcact ctgcctctgg agaaggacct aaagatcact ctctatgact 120 

atgacctcct ctccaaggac gaaaagatcg gtgagacggt cgtcgacctg gagaacaggc 180 

tgctgtccaa gtttggggct cgctgtggac tcccacagac ctactgtgtg tacgtggatg 240 

ggggctggct gcctgcttct ctg 263 

<210> 76 

<211> 237 

<212> DNA 

<213> Homo sapiens 

<400> 76 

aagcatctcg tctatgtctt gtgcttgctc ctcagctctg gaccgaacca gtggcgggac 60 

cagctccgcc cctcccagct cctccacctc ttctgccagc agcatagagt caaggcacct 120 

gtgtaccgga cagaccgtgt aatgtttcag gataaagaat attccattga agagataggt 180 

gagctgccac atgaccccaa accatggtgg gctctcgctg tatccctccc tctctca 237 

<210> 77 

<211> 245 

<212> DNA 

<213> Homo sapiens 

<400> 77 

tctctcgctt ccccagctcc tgcaactttt ttgtgttctc tctggggcag aggctggcag 60 

gatcccaaac ccacacctgg gcccagtgga ggagcgtctg gctctgcatg tgcttcagca 120 

gcagggcctg gtcccggagc acgtggagtc acggcccctc tacagccccc tgcagccaga 180 

catcgagcag gtaggacctt acccttggtc ccagagtcct cgaactccag aagcccaacc 240 

ccagg 245 

<210> 78 

<211> 214 

<212> DNA 

<213> Homo sapiens 

<400> 78 

ggtgcttggt aacagctggt taaatgagaa gggtggggag agaacggacc tgtctccgca 60 

ggggaagctg gggaagctgc agatgtgggt cgacctattt ccgaaggccc tggggcggcc 120 

tggacctccc ttcaacatca ccccacggag agccagaagg tgacttccca gccacaggct 180 

ctgagctggg ctgaggggtg gggcgttgca gcct 214 
<210> 79 

<211> 229 

<212> DNA 

<213> Homo sapiens 

<400> 79 

ttcttaaggc cttcccatcc tttggtagga aatctaggtg gattagagtg atacctttcc 60 
ccaggttttt cctgcgttgt attatctgga ataccagaga tgtgatcctg gatgacctga 120 
gcctcacggg ggagaagatg agcgacattt atgtgaaagg gtagggagcc agcgtcctct 180 
tgcctgtcca gcttcccgca gctcccgtgc tccctctggg ttgtgcaca 229 

<210> 80 

<211> 261 

<212> DNA 

<213> Homo sapiens 

<400> 80 

acgatgtata tactgtgttg gaaatcttaa tgagaactat tctctaaaaa catgtatgtc 60 
tagttggatg attggctttg aagaacacaa gcaaaagaca gacgtgcatt atcgttccct 120 
gggaggtgaa ggcaacttca actggaggtt cattttcccc ttcgactacc tgccagctga 180 
gcaagtctgt accattgcca agaaggtcag tgtccttccg attccctgtg gtgccagcac 240 
cagggcttct aaagttagcc t 261 

<210> 81 

<211> 234 

<212> DNA 

<213> Homo sapiens 

<400> 81 

tgcctctctc taactttgct tccttgcatc cttctctgtt cctcttccgg gtcaggatgc 60 
cttctggagg ctggacaaga ctgagagcaa aatcccagca cgagtggtgt tccagatctg 120 
ggacaatgac aagttctcct ttgatgattt tctggtgatt ttctgggtaa gcgctattgc 180 
tagaatccca ttctgcacat gggggctgcc ccagaaccca cactgtgtgt ttat 234 

<210> 82 

<211> 297 

<212> DNA 

<213> Homo sapiens 

<400> 82 

ggctacaggc tggcagtgat cgagaaaccc ggccaaaaac cacctctctg ttgcaggctc 60 
cctgcagctc gatctcaacc gcatgcccaa gccagccaag acagccaaga agtgctcctt 120 
ggaccagctg gatgatgctt tccacccaga atggtttgtg tccctttttg agcagaaaac 180 
agtgaagggc tggtggccct gtgtagcaga agagggtgag aagaaaatac tggcggtaag 240 
tctacttcct ccagccccag tggagggcat gggggaagct tcttccatag aaattgt 297 

<210> 83 

<211> 237 

<212> DNA 

<213> Homo sapiens 

<400> 83 

cctggttact ctccaggcca ctgagcagag ccttcgtgcc cctaaccaag tgctctctgt 60 
cccctcaggg caagctggaa atgaccttgg agattgtagc agagagtgag catgaggagc 120 
ggcctgctgg ccagggccgg gatgagccca acatgaaccc taagcttgag gacccaaggt 180 
cagtgcccag cccctgagcc ccaatgccca caggtctggg ggtataggca cagtcca 237 

<210> 84 

<211> 252 

<212> DNA 

<213> Homo sapiens 

<400> 84 

ccctagtaaa ggatgcccag ttgactccgg gatctcgctt ccaggcgccc cgacacctcc 60 
ttcctgtggt ttacctcccc atacaagacc atgaagttca tcctgtggcg gcgtttccgg 120 
tgggccatca tcctcttcat catcctcttc atcctgctgc tgttcctggc catcttcatc 180 
tacgccttcc cggtgagcag gcctgacgac actgtggtgg gggaactctg ggtctaatgg 240 
gggagttcat ca 252 

<210> 85 

<211> 391 

<212> DNA 

<213> Homo sapiens 



<400> 85 

tggctgtgcc tgccccagtg ggatcaccat gggtccctgt ctcctccctc cctccagaac 60 

tatgctgcca tgaagctggt gaagcccttc agctgaggac tctcctgccc tgtagaaggg 120 

gccgtggggt cccctccagc atgggactgg cctgcctcct ccgcccagct cggcgagctc 180 

ctccagacct cctaggcctg attgtcctgc cagggtgggc agacagacag atggaccggc 240 

ccacactccc agagttgcta acatggagct ctgagatcac cccacttcca tcatttcctt 300 

ctcccccaac ccaacgcttt tttggatcag ctcagacata tttcagtata aaacagttgg 360 

aaccacaaaa aaaaaaaaaa aaaaaaaaaa a 391 



<210> 86 
<211> 51 
<212> PRT 
<213> Homo sapiens 

<400> 86 

Lys Lys Arg Thr Lys Val Ile Lys Asn Ser Val Asn Pro Val Trp Asn 
1 5 10 15 

Glu Gly Phe Glu Trp Asp Leu Lys Gly Ile Pro Leu Asp Gln Gly Ser 
20 25 30 

Glu Leu His Val Val Val Lys Asp His Glu Thr Met Gly Arg Asn Arg 
35 40 45 

Phe Leu Gly 
50 

<210> 87 
<211> 45 
<212> PRT 
<213> Homo sapiens 

<400> 87 

Ser Lys Ile Leu Glu Lys Thr Ala Asn Pro Gln Trp Asn Gln Asn Ile 
1 5 10 15 

Thr Leu Pro Ala Met Phe Pro Ser Met Cys Glu Lys Met Arg Ile Arg 
20 25 30 

Ile Ile Asp Trp Asp Arg Leu Thr His Asn Asp Ile Val 
35 40 45 

<210> 88 
<211> 82 
<212> PRT 
<213> Homo sapiens 

<400> 88 

Gln Ala Arg Asp Leu Ala Ala Met Asp Lys Asp Ser Phe Ser Asp Pro 
1 5 10 15 

Tyr Ala Ile Val Ser Phe Leu His Gln Ser Gln Lys Thr Val Val Val 
20 25 30 

Lys Asn Thr Leu Asn Pro Thr Trp Asp Gln Thr Leu Ile Phe Tyr Glu 
35 40 45 

Ile Glu Ile Phe Gly Glu Pro Ala Thr Val Ala Glu Gln Pro Pro Ser 
50 55 60 

Ile Val Val Glu Leu Tyr Asp His Asp Thr Tyr Gly Ala Asp Glu Phe 
65 70 75 80 

Met Gly 

<210> 89 
<211> 79 
<212> PRT 
<213> Homo sapiens 

<400> 89 

Ile Tyr Ile Val Arg Ala Phe Gly Leu Gln Pro Lys Asp Pro Asn Gly 
1 5 10 15 

Lys Cys Asp Pro Tyr Ile Lys Ile Ser Ile Gly Lys Lys Ser Val Ser 
20 25 30 

Asp Gln Asp Asn Tyr Ile Pro Cys Thr Leu Glu Pro Val Phe Gly Lys 
35 40 45 

Met Phe Glu Leu Thr Cys Thr Leu Pro Leu Glu Lys Asp Leu Lys Ile 
50 55 60 

Thr Leu Tyr Asp Tyr Asp Leu Leu Ser Lys Asp Glu Lys Ile Gly 
65 70 75 

<210> 90 

<211> 152 

<212> DNA 

<213> Homo sapiens 

<400> 90 

acgatgtata tactgtgttg gaaatcttaa tgagaactat tctctaaaaa catgtatgtc 60 
tagttggatg attggctttg aagaacacaa gcaaaagaca gacgtgcatt atcgttccct 120 
gggaggtgaa ggcaacttca actggaggtt ca 152 

<210> 91 

<211> 56 

<212> DNA 

<213> Homo sapiens 

<400> 91 

gtcagtgtcc ttccgattcc ctgtggtgcc agcaccaggg cttctaaagt tagcct 56 

<210> 92 

<211> 55 

<212> DNA 

<213> Homo sapiens 

<400> 92 
tgcctctctc taactttgct tccttgcatc cttctctgtt cctcttccgg gtcag 55 

<210> 93 

<211> 68 

<212> DNA 

<213> Homo sapiens 

<400> 93 

gtaagcgcta ttgctagaat cccattctgc acatgggggc tgccccagaa cccacactgt 60 

gtgtttat 68 

<210> 94 

<211> 56 

<212> DNA 

<213> Homo sapiens 

<400> 94 

ggctacaggc tggcagtgat cgagaaaccc ggccaaaaac cacctctctg ttgcag 56 

<210> 95 

<211> 62 

<212> DNA 

<213> Homo sapiens 

<400> 95 

gtaagtctac ttcctccagc cccagtggag ggcatggggg aagcttcttc catagaaatt 60 
gt 62 

<210> 96 
<211> 68 
<212> DNA 
<213> Homo sapiens 
<400> 96 

cctggttact ctccaggcca ctgagcagag ccttcgtgcc cctaaccaag tgctctctgt 60 
cccctcag 68 

<210> 97 

<211> 59 

<212> DNA 

<213> Homo sapiens 

<400> 97 

gtcagtgccc agcccctgag ccccaatgcc cacaggtctg ggggtatagg cacagtcca 59 

<210> 98 

<211> 44 

<212> DNA 

<213> Homo sapiens 

<400> 98 

ccctagtaaa ggatgcccag ttgactccgg gatctcgctt ccag 44 

<210> 99 

<211> 60 

<212> DNA 

<213> Homo sapiens 

<400> 99 

gtgagcaggc ctgacgacac tgtggtgggg gaactctggg tctaatgggg gagttcatca 60 

<210> 100 

<211> 57 

<212> DNA 

<213> Homo sapiens 

<400> 100 

tggctgtgcc tgccccagtg ggatcaccat gggtccctgt ctcctccctc cctccag 57 

<210> 101 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 101 

tctcttctcc tagagggcca tag 23 

<210> 102 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 102 

ctgttcctcc ccatcgtctc atgg 24 

<210> 103 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 103 

gctcctcccg tgaccctctg 20 

<210> 104 

<211> 21 

<212> DNA 

<213> Homo sapiens 
<400> 104

gggtcccagc caggagcact g 21 

<210> 105 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 105 

cccctctcac catctcctga tgtg 24 

<210> 106 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 106 

tggcttcacc ttccctctac ctcgg 25 

<210> 107 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 107 

tcctttggta ggaaatctag gtgg 24 

<210> 108 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 108 

ggaagctgga caggcaagag g 21 

<210> 109 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 109 

atatactgtg ttggaaatct taatgag 27

<210> 110 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 110 

gctggcacca cagggaatcg g 21 

<210> 111 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 111 

ctttgcttcc ttgcatcctt ctctg 25

<210> 112 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 112 

agcccccatg tgcagaatgg g 21 
<210> 113

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 113 
ggcagtgatc gagaaacccg g 

<210> 114 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 114 
catgccctcc actggggctg g 

<210> 115 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 115 
ggatgcccag ttgactccgg g 

<210> 116 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 116 
ccccaccaca gtgtcgtcag g 

<210> 117 

<211> 6240 

<212> DNA 

<213> Homo sapiens 



<400> 117 

atgctgaggg tcttcatcct ctatgccgag aacgtccaca cacccgacac cgacatcagc 60 

gatgcctact gctccgcggt gtttgcaggg gtgaagaaga gaaccaaagt catcaagaac 120 

agcgtgaacc ctgtatggaa tgagggattt gaatgggacc tcaagggcat ccccctggac 180 

cagggctctg agcttcatgt ggtggtcaaa gaccatgaga cgatggggag gaacaggttc 240 

ctgggggaag ccaaggtccc actccgagag gtcctcgcca cccctagtct gtccgccagc 300 

ttcaatgccc ccctgctgga caccaagaag cagcccacag gggcctcgct ggtcctgcag 360 

gtgtcctaca caccgctgcc tggagctgtg cccctgttcc cgccccctac tcctctggag 420 

ccctccccga ctctgcctga cctggatgta gtggcagaca caggaggaga ggaagacaca 480 

gaggaccagg gactcactgg agatgaggcg gagccattcc tggatcaaag cggaggcccg 540 

ggggctccca ccaccccaag gaaactacct tcacgtcctc cgccccacta ccccgggatc 600 

aaaagaaagc gaagtgcgcc tacatctaga aagctgctgt cagacaaacc gcaggatttc 660 

cagatcaggg tccaggtgat cgaggggcgc cagctgccgg gggtgaacat caagcctgtg 720 

gtcaaggtta ccgctgcagg gcagaccaag cggacgcgga tccacaaggg aaacagccca 780 

ctcttcaatg agactctttt cttcaacttg tttgactctc ctggggagct gtttgatgag 840 

cccatcttta tcacggtggt agactctcgt tctctcagga cagatgctct cctcggggag 900 

ttccggatgg acgtgggcac catttacaga gagccccggc acgcctatct caggaagtgg 960 

ctgctgctct cagaccctga tgacttctct gctggggcca gaggctacct gaaaacaagc 1020 

ctttgtgtgc tggggcctgg ggacgaagcg cctctggaga gaaaagaccc ctctgaagac 1080 

aaggaggaca ttgaaagcaa cctgctccgg cccacaggcg tagccctgcg aggagcccac 1140 

ttctgcctga aggtcttccg ggccgaggac ttgccgcaga tggacgatgc cgtgatggac 1200 

aacgtgaaac agatctttgg cttcgagagt aacaagaaga acttggtgga cccctttgtg 1260 

gaggtcagct ttgcggggaa aatgctgtgc agcaagatct tggagaagac ggccaaccct 1320 

cagtggaacc agaacatcac actgcctgcc atgtttccct ccatgtgcga aaaaatgagg 1380 

attcgtatca tagactggga ccgcctgact cacaatgaca tcgtggctac cacctacctg 1440 

agtatgtcga aaatctctgc ccctggagga gaaatagaag aggagcctgc aggtgctgtc 1500 

aagccttcga aagcctcaga cttggatgac tacctgggct tcctccccac ttttgggccc 1560 

tgctacatca acctctatgg cagtcccaga gagttcacag gcttcccaga cccctacaca 1620 

gagctcaaca caggcaaggg ggaaggtgtg gcttatcgtg gccggcttct gctctccctg 1680 

gagaccaagc tggtggagca cagtgaacag aaggtggagg accttcctgc ggatgacatc 1740

ctccgggtgg agaagtacct taggaggcgc aagtactccc tgtttgcggc cttctactca 1800

gccaccatgc tgcaggatgt ggatgatgcc atccagtttg aggtcagcat cgggaactac 1860

gggaacaagt tcgacatgac ctgcctgccg ctggcctcca ccactcagta cagccgtgca 1920

gtctttgacg ggtgccacta ctactaccta ccctggggta acgtgaaacc tgtggtggtg 1980

ctgtcatcct actgggagga catcagccat agaatcgaga ctcagaacca gctgcttggg 2040

attgctgacc ggctggaagc tggcctggag caggtccacc tggccctgaa ggcgcagtgc 2100

tccacggagg acgtggactc gctggtggct cagctgacgg atgagctcat cgcaggctgc 2160

agccagcctc tgggtgacat ccatgagaca ccctctgcca cccacctgga ccagtacctg 2220

taccagctgc gcacccatca cctgagccaa atcactgagg ctgccctggc cctgaagctc 2280

ggccacagtg agctccctgc agctctggag caggcggagg actggctcct gcgtctgcgt 2340

gccctggcag aggagcccca gaacagcctg ccggacatcg tcatctggat gctgcaggga 2400

gacaagcgtg tggcatacca gcgggtgccc gcccaccaag tcctcttctc ccggcggggt 2460

gccaactact gtggcaagaa ttgtgggaag ctacagacaa tctttctgaa atatccgatg 2520

gagaaggtgc ctggcgcccg gatgccagtg cagatacggg tcaagctgtg gtttgggctc 2580

tctgtggatg agaaggagtt caaccagttt gctgagggga agctgtctgt ctttgctgaa 2640

acctatgaga acgagactaa gttggccctt gttgggaact ggggcacaac gggcctcacc 2700

taccccaagt tttctgacgt cacgggcaag atcaagctac ccaaggacag cttccgcccc 2760

tcggccggct ggacctgggc tggagattgg ttcgtgtgtc cggagaagac tctgctccat 2820

gacatggacg ccggtcacct gagcttcgtg gaagaggtgt ttgagaacca gacccggctt 2880

cccggaggcc agtggatcta catgagtgac aactacaccg atgtgaacgg ggagaaggtg 2940

cttcccaagg atgacattga gtgcccactg ggctggaagt gggaagatga ggaatggtcc 3000

acagacctca accgggctgt cgatgagcaa ggctgggagt atagcatcac catccccccg 3060

gagcggaagc cgaagcactg ggtccctgct gagaagatgt actacacaca ccgacggcgg 3120

cgctgggtgc gcctgcgcag gagggatctc agccaaatgg aagcactgaa aaggcacagg 3180

caggcggagg cggagggcga gggctgggag tacgcctctc tttttggctg gaagttccac 3240

ctcgagtacc gcaagacaga tgccttccgc cgccgccgct ggcgccgtcg catggagcca 3300

ctggagaaga cggggcctgc agctgtgttt gcccttgagg gggccctggg cggcgtgatg 3360

gatgacaaga gtgaagattc catgtccgtc tccaccttga gcttcggtgt gaacagaccc 3420

acgatttcct gcatattcga ctatgggaac cgctaccatc tacgctgcta catgtaccag 3480

gcccgggacc tggctgcgat ggacaaggac tctttttctg atccctatgc catcgtctcc 3540

ttcctgcacc agagccagaa gacggtggtg gtgaagaaca cccttaaccc cacctgggac 3600

cagacgctca tcttctacga gatcgagatc tttggcgagc cggccacagt tgctgagcaa 3660

ccgcccagca ttgtggtgga gctgtacgac catgacactt atggtgcaga cgagtttatg 3720

ggtcgctgca tctgtcaacc gagtctggaa cggatgccac ggctggcctg gttcccactg 3780

acgaggggca gccagccgtc gggggagctg ctggcctctt ttgagctcat ccagagagag 3840

aagccggcca tccaccatat tcctggtttt gaggtgcagg agacatcaag gatcctggat 3900

gagtctgagg acacagacct gccctaccca ccaccccaga gggaggccaa catctacatg 3960

gttcctcaga acatcaagcc agcgctccag cgtaccgcca tcgagatcct ggcatggggc 4020

ctgcggaaca tgaagagtta ccagctggcc aacatctcct cccccagcct cgtggtagag 4080

tgtgggggcc agacggtgca gtcctgtgtc atcaggaacc tccggaagaa ccccaacttt 4140

gacatctgca ccctcttcat ggaagtgatg ctgcccaggg aggagctcta ctgccccccc 4200

atcaccgtca aggtcatcga taaccgccag tttggccgcc ggcctgtggt gggccagtgt 4260

accatccgct ccctggagag cttcctgtgt gacccctact cggcggagag tccatcccca 4320

cagggtggcc cagacgatgt gagcctactc agtcctgggg aagacgtgct catcgacatt 4380

gatgacaagg agcccctcat ccccatccag gaggaagagt tcatcgattg gtggagcaaa 4440

ttctttgcct ccatagggga gagggaaaag tgcggctcct acctggagaa ggattttgac 4500

accctgaagg tctatgacac acagctggag aatgtggagg cctttgaggg cctgtctgac 4560

ttttgtaaca ccttcaagct gtaccggggc aagacgcagg aggagacaga agatccatct 4620

gtgattggtg aatttaaggg cctcttcaaa atttatcccc tcccagaaga cccagccatc 4680

cccatgcccc caagacagtt ccaccagctg gccgcccagg gaccccagga gtgcttggtc 4740

cgtatctaca ttgtccgagc atttggcctg cagcccaagg accccaatgg aaagtgtgat 4800

ccttacatca agatctccat agggaagaaa tcagtgagtg accaggataa ctacatcccc 4860

tgcacgctgg agcccgtatt tggaaagatg ttcgagctga cctgcactct gcctctggag 4920

aaggacctaa agatcactct ctatgactat gacctcctct ccaaggacga aaagatcggt 4980

gagacggtcg tcgacctgga gaacaggctg ctgtccaagt ttggggctcg ctgtggactc 5040

ccacagacct actgtgtctc tggaccgaac cagtggcggg accagctccg cccctcccag 5100

ctcctccacc tcttctgcca gcagcataga gtcaaggcac ctgtgtaccg gacagaccgt 5160

gtaatgtttc aggataaaga atattccatt gaagagatag aggctggcag gatcccaaac 5220

ccacacctgg gcccagtgga ggagcgtctg gctctgcatg tgcttcagca gcagggcctg 5280

gtcccggagc acgtggagtc acggcccctc tacagccccc tgcagccaga catcgagcag 5340

gggaagctgc agatgtgggt cgacctattt ccgaaggccc tggggcggcc tggacctccc 5400

ttcaacatca ccccacggag agccagaagg tttttcctgc gttgtattat ctggaatacc 5460

agagatgtga tcctggatga cctgagcctc acgggggaga agatgagcga catttatgtg 5520

aaaggttgga tgattggctt tgaagaacac aagcaaaaga cagacgtgca ttatcgttcc 5580

ctgggaggtg aaggcaactt caactggagg ttcattttcc ccttcgacta cctgccagct 5640

gagcaagtct gtaccattgc caagaaggat gccttctgga ggctggacaa gactgagagc 5700

aaaatcccag cacgagtggt gttccagatc tgggacaatg acaagttctc ctttgatgat 5760

tttctgggct ccctgcagct cgatctcaac cgcatgccca agccagccaa gacagccaag 5820

aagtgctcct tggaccagct ggatgatgct ttccacccag aatggtttgt gtcccttttt 5880 

gagcagaaaa cagtgaaggg ctggtggccc tgtgtagcag aagagggtga gaagaaaata 5940 

ctggcgggca agctggaaat gaccttggag attgtagcag agagtgagca tgaggagcgg 6000 

cctgctggcc agggccggga tgagcccaac atgaacccta agcttgagga cccaaggcgc 6060 

cccgacacct ccttcctgtg gtttacctcc ccatacaaga ccatgaagtt catcctgtgg 6120 

cggcgtttcc ggtgggccat catcctcttc atcatcctct tcatcctgct gctgttcctg 6180 

gccatcttca tctacgcctt cccgaactat gctgccatga agctggtgaa gcccttcagc 6240 

<210> 118 

<211> 13 

<212> DNA 

<213> Homo sapiens 



<400> 118 
cgcaagcatg ctg 13



<210> 119 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 119 

gagacgatgg gg 12 

<210> 120 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 120 

gatctaaccc tgctgctcac c 21 

<210> 121 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 121 

ctggtgtgtt gcagagcgct g 21 

<210> 122 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 122 

cctctcttct gctgtcttca g 21 

<210> 123 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 123 

tgtgtctggt tcaccttcgt g 21

<210> 124 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 124 

tccaaataga aatgcctgaa c 21 

<210> 125 
<211> 21 
<212> DNA

<213> Homo sapiens 

<400> 125 
aggtatcacc tccaagtgtt g 

<210> 126 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 126 
taccagcttc agagctccct g 

<210> 127 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 127 
ttgatcaggg tgctcttgg 

<210> 128 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 128 
ggagaattgc ttgaacccag

<210> 129 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 129 
tggctaatga tgttgaacat tt 

<210> 130 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 130 
gacccacaag cggcgcctcg g 

<210> 131 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 131 
gaccccggcg agggtggtcg g 

<210> 132 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 132 
tgtctctcca ttctcccttt tgtg 

<210> 133 

<211> 24 

<212> DNA 

<213> Homo sapiens

<400> 133
aggacactgc tgagaaggca cctc 

<210> 134 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 134 
agtgccctgg tggcacgaag g 

<210> 135 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 135 
cctacctgca ccttcaagcc atgg 

<210> 136 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 136 
cagaagagcc agggtgcctt agg 

<210> 137 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 137 
ccttggacct taacctggca gagg 

<210> 138 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 138 
cgaggccagc gcaccaacct g 

<210> 139 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 139 
actgccggcc attcttgctg gg 

<210> 140 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 140 
ccaggcctca ttagggccct c 

<210> 141 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 141 
ctgaagagga gcctggggtc ag 
<210> 142

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 142 
ctgagatttc tgactcttgg ggtg 

<210> 143 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 143 
aaggttctgc cctcatgccc catg 

<210> 144 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 144 
ctggcctgag ggatcagcag g 

<210> 145 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 145 
gtgcatacat acagcccacg gag 

<210> 146 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 146 
gagctattgg gttggccgtg tggg 

<210> 147 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 147 
accaacacgg agaagtgaga actg 

<210> 148 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 148 
ccacacttta tttaacgctt tggcgg 

<210> 149 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 149 
cagaaccaaa atgcaaggat acgg 

<210> 150 
<211> 25 
<212> DNA 
<213> Homo sapiens
<400> 150 

cttctgattc tgggatcacc aaagg 25 

<210> 151 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 151 

ggaccgtaag gaagacccag gg 22 

<210> 152 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 152 

cctgtgctca ggagcgcatg aagg 24 

<210> 153 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 153 

gcagacctcc cacccaaggg cg 22

<210> 154 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 154 

gagacagatg ggggacagtc aggg 24 

<210> 155 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 155 

cctcccgaga gaaccctcct g 21 

<210> 156 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 156 

gggagcccag agtccccatg g 21 

<210> 157 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 157 

gggcctcctt gggtttgctg g 21 



<210> 158 

<211> 21 

<212> DNA 

<213> Homo sapiens 
<400> 158
gcctccccag catcctgccg g 

<210> 159 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 159 
tcactgagcc gaatgaaact gagg 

<210> 160 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 160 

tgtggcctga gttcctttcc tgtg 24 

<210> 161 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 161 
ggtcaaaggg cagaacgaag aggg 

<210> 162 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 162 
cccgtccttc tcccagccat g 

<210> 163 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 163 
ctcccctggt tgtccccaag g 

<210> 164 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 164 

cgacccctct gattgccact tgtg 24 

<210> 165 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 165 

ggcatcctgc ccttgccagg g 21 

<210> 166 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 166 

tctgtctccc ctgctccttg 20 
<210> 167

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 167 
cttccctgcc ccgacgccca g 

<210> 168 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 168 
cagcgctcag gcccgtctct c 

<210> 169 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 169 
tgcataggca tgtgcagctt tggg 

<210> 170 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 170
catgcaccct ctgccctgtg g 

<210> 171 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 171 
agttgagcca ggagaggtgg g 

<210> 172 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 172 
catcaggcgc attccatctg tccg 

<210> 173 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 173 
agcaggagag cagaagaaga aagg 

<210> 174 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 174 
gtgtgtcacc atccccaccc cg

<210> 175 
<211> 25 
<212> DNA 
<213> Homo sapiens

<400> 175 
caagagatgg gagaaaggcc ttatg 

<210> 176 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 176 
ctgggacatc cggatcctga agg 

<210> 177 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 177 
tccaggtagt gggaggcaga gg 

<210> 178 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 178 
tcccactacc tggagctgcc ttgg 

<210> 179 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 179
ggctctcccc agccctccct g 

<210> 180 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 180 
cagagcagca gagactctga ccag 

<210> 181 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 181 
tagaccccac ctgcccctga g 

<210> 182 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 182 

tcctctcatt gcttgcctgt tcgg 24 

<210> 183 

<211> 21 

<212> DNA 

<213> Homo sapiens 
<400> 183

ttgagagctt gccggggatg g 21 

<210> 184 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 184 

aagtgccaag caatgagtga ccgg 24 

<210> 185 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 185 

ctcactccca cccaccacct g 21 

<210> 186 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 186 

cccaccggcc tctgagtctg c 21 

<210> 187 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 187 

accctaccca agccaggaca agtg 24 

<210> 188 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 188 

gaatctgcca taaccagctt cgtg 24 

<210> 189 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 189 

tatcacccca tagaggcctc gaag 24 

<210> 190 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 190 

cagccactca ctctggcacc tctg 24 

<210> 191 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 191 

agcccacagt ctctgactct cctg 24 
<210> 192

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 192 
acatctctca gggtccctgc tgtg 

<210> 193 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 193 
cctgtgaggg gacgaggcag g 

<210> 194 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 194 
gccctgggta agggatgctg attc 

<210> 195 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 195 
cctgcctggg cctcctggat c 

<210> 196 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 196 
gagggtgatg ggggccttag g 

<210> 197 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 197 
gcaatcagtt tgaagaagga aagg 

<210> 198 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 198 
cacctttgtc tccattctac ctgc 

<210> 199 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 199 
ctcccagccc ccacgcccag g 

<210> 200 
<211> 24 
<212> DNA 
<213> Homo sapiens
<400> 200 

ctgagccact ctcctcattc tgtg 24 

<210> 201 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 201 

tggaagggga cagtagggag g 21 

<210> 202 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 202 

ggccagtgcg ttcttcctcc tc 22 

<210> 203 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 203 

tccctgacct gcccatcatc tc 22 

<210> 204 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 204 

gcccctgtca ggcctggatg g 21 

<210> 205 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 205 

tgacccaggc ctccctggag g 21 

<210> 206 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 206 

ctgaaatggt ctctttcttt ctac 24 

<210> 207 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 207 

cacaccgact gtcagactga agag 24 

<210> 208 

<211> 24 

<212> DNA 

<213> Homo sapiens 
<400> 208
ttgtcccctc ctctaatccc catg 24

<210> 209 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 209
gggttaggga cgtcttcgag g 21

<210> 210 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 210
cagccaaacc atatcaacaa tg 22

<210> 211 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 211
ctggggaggt gagggctcta g 21

<210> 212 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 212
gaagtgtttt gtctcctcct c 21

<210> 213 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 213
gcaggcagcc agcccccatc 20

<210> 214 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 214
gggtgccctg tgttggctga c 21

<210> 215 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 215
gcaggcagcc agcccccatc 20

<210> 216 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 216 
ctcgtctatg tcttgtgctt gctc 
<210> 217

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 217 
caccatggtt tggggtcatg tgg 

<210> 218 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 218 
tctcgcttcc ccagctcctg c 

<210> 219 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 219 
tctggagttc gaggactctg gg 

<210> 220 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 220 
agaagggtgg ggagagaacg g 

<210> 221 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 221 
cagctcagag cctgtggctg g 

<210> 222 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 222 
aaggccttcc catcctttgg tagg 

<210> 223 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 223 
acaacccaga gggagcacgg g 

<210> 224 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 224 
gttgacgatg tatatactgt gttgg 

<210> 225 
<211> 25 
<212> DNA 
<213> Homo sapiens
<400> 225 

gcctctctct aactttgctt ccttg 25 

<210> 226 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 226 

ggctacaggc tggcagtgat cgag 24 

<210> 227 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 227 

ttcccccatg ccctccactg g 21 

<210> 228 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 228 

agccttcgtg cccctaacca agtg 24 

<210> 229 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 229 

ctgtgggcat tggggctcag g 21 

<210> 230 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 230 

gccccagtgg gatcaccatg 20 

<210> 231 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 231 

atgctggagg ggaccccacg g 21 

<210> 232 

<211> 3671 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (418)..(3381)
<400> 232 

tcctggttca agcgattctc tggcctcagc ctcccgagta gctgggatta caggcatgct 60 
ccaccaagcc cgggtaattt tgtattttta atagagacgg ggttttgcca tgttggtcag 120 
gctggtctcg aactcctgac ctcaggtgat ctgcccacct tggcctccca acgtgctgag 180 
attacaggca tgagtcactg tgcccggcag agatggtcta attcatatga aagaactctg 240 
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aaaaaagtag aaagtgattt tctaaaataa ggtacaaata attaatgtaa gcataatcac 300 
ctaaccttgt ggaatttttt ttttttgaga agcaaattgc aaatttgtga tagatctaaa 360 
ggagattgac taagagggtg accatctgga aatgacgtca tgtgagaatg gttaaag atg 420 

Met 
1 

etc ggg aga ttg age eta gag aaa gga aga ttt gtg aac cca gga ggc 468 
Leu Gly Arg Leu Ser Leu Glu Lys Gly Arg Phe Val Asn Pro Gly Gly 
5 10 15 

aga ggt aga gat cca gga gag ggc ggc gtg atg gat gac aag agt gaa 516 
Arq Gly Arg Asp Pro Gly Glu Gly Gly Val Met Asp Asp Lys Ser Glu 
20 25 30 



gat tec atg tec gtc tec acc ttg age ttc ggt gtg aac aga ccc acg 
Asp Ser Met Ser Val Ser Thr Leu Ser Phe Gly Val Asn Arg Pro Thr 
35 40 45 



ctg ctg gee tct ttt gag etc ate cag aga gag aag ccg gee ate cac 
Leu Leu Ala Ser Phe Glu Leu He Gin Arg Glu Lys Pro Ala He His 
180 185 190 

cat att cct ggt ttt gag gtg cag gag aca tea agg ate ctg gat gag 
His He Pro Gly Phe Glu Val Gin Glu Thr Ser Arg He Leu Asp Glu 
195 200 205 

tct gag gac aca gac ctg ccc tac cca cca ccc cag agg gag gec aac 
Ser Glu Asp Thr Asp Leu Pro Tyr Pro Pro Pro Gin Arg Glu Ala Asn 
210 215 220 225 

ate tac atg gtt cct cag aac ate aag cca gcg etc cag cgt acc gee 
He Tyr Met Val Pro Gin Asn He Lys Pro Ala Leu Gin Arg Thr Ala 
230 235 240 



564 



att tec tgc ata ttc gac tat ggg aac cgc tac cat eta cgc tgc tac 612 
He Ser Cys He Phe Asp Tyr Gly Asn Arg Tyr His Leu Arg Cys Tyr 
50 55 60 65 



atg tac cag gee egg gac ctg get gcg atg gac aag gac tct ttt tct 660 
Met Tyr Gin Ala Arg Asp Leu Ala Ala Met Asp Lys Asp Ser Phe Ser 
70 75 80 

gat ccc tat gee ate gtc tec ttc ctg cac cag age cag aag acg gtg 708 
Asp Pro Tyr Ala He Val Ser Phe Leu His Gin Ser Gin Lys Thr Val 
85 90 95 

gtg gtg aag aac acc ctt aac ccc acc tgg gac cag acg etc ate ttc 756 
Val Val Lys Asn Thr Leu Asn Pro Thr Trp Asp Gin Thr Leu He Phe 
100 105 HO 

tac gag ate gag ate ttt ggc gag ccg gec aca gtt get gag caa ccg 804 
Tyr Glu He Glu He Phe Gly Glu Pro Ala Thr Val Ala Glu Gin Pro 
115 120 125 

ccc age att gtg gtg gag ctg tac gac cat gac act tat ggt gca gac 
Pro Ser He Val Val Glu Leu Tyr Asp His Asp Thr Tyr Gly Ala Asp 
130 135 140 145 

gag ttt atg ggt cgc tgc ate tgt caa ccg agt ctg gaa egg atg cca 
Glu Phe Met Gly Arg Cys He Cys Gin Pro Ser Leu Glu Arg Met Pro 
150 155 160 

egg ctg gee tgg ttc cca ctg acg agg ggc age cag ccg teg ggg gag 948 
Arg Leu Ala Trp Phe Pro Leu Thr Arg Gly Ser Gin Pro Ser Gly Glu 
165 170 175 



852 



900 



996 



1044 



1092 
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ate gag ate ctg gca tgg ggc ctg egg aac atg aag agt tac cag ctg 
lie Glu He Leu Ala Trp Gly Leu Arg Asn Met Lys Ser Tyr Gin Leu 
245 250 255 

occ aac ate tec tec ccc age etc gtg gta gag tgt ggg ggc cag acg 
Ala Asn He Ser Ser Pro Ser Leu Val Val Glu Cys Gly Gly Gin Thr 
260 265 270 

gtg cag tec tgt gtc ate agg aac etc egg aag aac ccc aac ttt gac 
Val Gin Ser Cys Val He Arg Asn Leu Arg Lys Asn Pro Asn Phe Asp 
275 280 285 

ate tgc ace etc ttc atg gaa gtg atg ctg ccc agg gag gag etc tac 
He Cys Thr Leu Phe Met Glu Val Met Leu Pro Arg Glu Glu Leu Tyr 
290 295 300 305 

tac ccc ccc ate ace gtc aag gtc ate gat aac cgc cag ttt ggc cgc 
Cvs Pro Pro He Thr Val Lys Val He Asp Asn Arg Gin Phe Gly Arg 
y 310 315 320 

cqq cct gtg gtg ggc cag tgt ace ate cgc tec ctg gag age ttc ctg 
Arc Pro Val Val Gly Gin Cys Thr He Arg Ser Leu Glu Ser Phe Leu 
325 "* 330 335 

tgt gac ccc tac teg gcg gag agt cca tec cca cag ggt ggc cca gac 
Cvs Asp Pro Tyr Ser Ala Glu Ser Pro Ser Pro Gin Gly Gly Pro Asp 
340 345 350 

qat gtg age eta etc agt cct ggg gaa gac gtg etc ate gac att gat 
Asp Val Ser Leu Leu Ser Pro Gly Glu Asp Val Leu He Asp He Asp 
r 355 360 365 

aac aaq gag ccc etc ate ccc ate cag gag gaa gag ttc ate gat tgg 
Aso Lvs Glu Pro Leu He Pro He Gin Glu Glu Glu Phe He Asp Trp 
370 375 380 385 

tgg age aaa ttc ttt gee tec ata ggg gag agg gaa aag tgc ggc tec 
Trp Ser Lys Phe Phe Ala Ser He Gly Glu Arg Glu Lys Cys Gly Ser 
* 390 395 400 

tac ctg gag aag gat ttt gac ace ctg aag gtc tat gac aca cag ctg 
Tyr Leu Glu Lys Asp Phe Asp Thr Leu Lys Val Tyr Asp Thr Gin Leu 
405 410 415 

qaq aat gtg gag gee ttt gag ggc ctg tct gac ttt tgt aac ace ttc 
Glu Asn Val Glu Ala Phe Glu Gly Leu Ser Asp Phe Cys Asn Thr Phe 
420 425 430 

aaq ctg tac egg ggc aag acg cag gag gag aca gaa gat cca tct gtg 
Lys Leu Tyr Arg Gly Lys Thr Gin Glu Glu Thr Glu Asp Pro Ser Val 
435 440 445 

att ggt gaa ttt aag ggc etc ttc aaa att tat ccc etc cca gaa gac 
He Gly Glu Phe Lys Gly Leu Phe Lys He Tyr Pro Leu Pro Glu Asp 
450 455 460 465 

cca gee ate ccc atg ccc cca aga cag ttc cac cag ctg gee gee cag 
Pro Ala He Pro Met Pro Pro Arg Gin Phe His Gin Leu Ala Ala Gin 
470 475 480 

qga ccc cag gag tgc ttg gtc cgt ate tac att gtc cga gca ttt ggc 
Glv Pro Gin Glu Cys Leu Val Arg He Tyr He Val Arg Ala Phe Gly 
485 490 495 

ctg cag ccc aag gac ccc aat gga aag tgt gat cct tac ate aag ate 
Leu Gin Pro Lys Asp Pro Asn Gly Lys Cys Asp Pro Tyr He Lys He 
500 505 510 



1188 
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tec ata ggg aag aaa tea gtg agt gac cag gat aac tac ate ccc tgc 
Ser He Gly Lys Lys Ser Val Ser Asp Gin Asp Asn Tyr He Pro Cys 
515 520 525 

acg ctg gag ccc gta ttt gga aag atg ttc gag ctg acc tgc act ctg 
Thr Leu Glu Pro Val Phe Gly Lys Met Phe Glu Leu Thr Cys Thr Leu 
530 535 540 545 



aca gac cgt gta atg ttt cag gat aaa gaa tat tec att gaa gag ata 
Thr Asp Arg Val Met Phe Gin Asp Lys Glu Tyr Ser He Glu Glu He 
630 635 640 

gag get ggc agg ate cca aac cca cac ctg ggc cca gtg gag gag cgt 
Glu Ala Gly Arg He Pro Asn Pro His Leu Gly Pro Val Glu Glu Arg 
645 650 655 



gga cct ccc ttc aac ate acc cca egg aga gee aga agg ttt ttc ctg 
Gly Pro Pro Phe Asn He Thr Pro Arg Arg Ala Arg Arg Phe Phe Leu 
710 715 720 

cgt tgt att ate tgg aat acc aga gat gtg ate ctg gat gac ctg age 
Arg Cys He He Trp Asn Thr Arg Asp Val He Leu Asp Asp Leu Ser 
725 730 735 



ggc ttt gaa gaa cac aag caa aag aca gac gtg cat tat cgt tec ctg 
Gly Phe Glu Glu His Lys Gin Lys Thr Asp Val His Tyr Arg Ser Leu 
755 760 765 



2004 



2052 



cct ctg gag aag gac eta aag ate act etc tat gac tat gac etc etc 2100 
Pro Leu Glu Lys Asp Leu Lys He Thr Leu Tyr Asp Tyr Asp Leu Leu 
550 555 560 

tec aag gac gaa aag ate ggt gag acg gtc gtc gac ctg gag aac agg 2148 
Ser Lys Asp Glu Lys He Gly Glu Thr Val Val Asp Leu Glu Asn Arg 
565 ~ 570 575 

ctg ctg tec aag ttt ggg get cgc tgt gga etc cca cag acc tac tgt 2196 
Leu Leu Ser Lys Phe Gly Ala Arg Cys Gly Leu Pro Gin Thr Tyr Cys 
580 585 590 

gtc tct gga ccg aac cag tgg egg gac cag etc cgc ccc tec cag etc 2244 
Val Ser Gly Pro Asn Gin Trp Arg Asp Gin Leu Arg Pro Ser Gin Leu 
595 600 605 

etc cac etc ttc tgc cag cag cat aga gtc aag gca cct gtg tac egg 2292 
Leu His Leu Phe Cys Gin Gin His Arg Val Lys Ala Pro Val Tyr Arg 
610 615 620 625 



2340 



2388 



ctg get ctg cat gtg ctt cag cag cag ggc ctg gtc ccg gag cac gtg 2436 
Leu Ala Leu His Val Leu Gin Gin Gin Gly Leu Val Pro Glu His Val 
660 665 670 

gag tea egg ccc etc tac age ccc ctg cag cca gac ate gag cag ggg 2484 
Glu Ser Arg Pro Leu Tyr Ser Pro Leu Gin Pro Asp He Glu Gin Gly 
675 " 680 685 

aag ctg cag atg tgg gtc gac eta ttt ccg aag gee ctg ggg egg cct 2 532 

Lys Leu Gin Met Trp Val Asp Leu Phe Pro Lys Ala Leu Gly Arg Pro 
690 695 700 705 



2580 



2628 



etc acg ggg gag aag atg age gac att tat gtg aaa ggt tgg atg att 2676 
Leu Thr Gly Glu Lys Met Ser Asp He Tyr Val Lys Gly Trp Met He 
740 ** 745 750 



2724 
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aaa aqt gaa ggc aac ttc aac tgg agg ttc att ttc ccc ttc gac tac 
Gly Gly Glu Gly Asn Phe Asn Trp Arg Phe lie Phe Pro Phe Asp Tyr 
770 775 780 785 

ctg cca get gag caa gtc tgt acc att gec aag aag gat gee ttc tgg 
Leu Pro Ala Glu Gin Val Cys Thr He Ala Lys Lys Asp Ala Phe Trp 
790 795 800 

aao cta aac aag act gag agc aaa atc cca gca cga gtg gtg ttc cag 
Arg Leu Asp Lys Thr Glu Ser Lys Ile Pro Ala Arg Val Val Phe Gln 
805 810 815 

atc tgg gac aat gac aag ttc tcc ttt gat gat ttt ctg ggc tcc ctg 
Ile Trp Asp Asn Asp Lys Phe Ser Phe Asp Asp Phe Leu Gly Ser Leu 
820 825 830 

cag ctc gat ctc aac cgc atg ccc aag cca gcc aag aca gcc aag aag 
Gln Leu Asp Leu Asn Arg Met Pro Lys Pro Ala Lys Thr Ala Lys Lys 
835 840 845 

tgc tcc ttg gac cag ctg gat gat gct ttc cac cca gaa tgg ttt gtg 
Cys Ser Leu Asp Gln Leu Asp Asp Ala Phe His Pro Glu Trp Phe Val 
850 855 860 865 

tcc ctt ttt gag cag aaa aca gtg aag ggc tgg tgg ccc tgt gta gca 
Ser Leu Phe Glu Gln Lys Thr Val Lys Gly Trp Trp Pro Cys Val Ala 
870 875 880 

gaa gag ggt gag aag aaa ata ctg gcg ggc aag ctg gaa atg acc ttg 
Glu Glu Gly Glu Lys Lys Ile Leu Ala Gly Lys Leu Glu Met Thr Leu 
885 890 895 

gag att gta gca gag agt gag cat gag gag cgg cct gct ggc cag ggc 
Glu Ile Val Ala Glu Ser Glu His Glu Glu Arg Pro Ala Gly Gln Gly 
900 905 910 

caa gat gag ccc aac atg aac cct aag ctt gag gac cca agg cgc ccc 
Arg Asp Glu Pro Asn Met Asn Pro Lys Leu Glu Asp Pro Arg Arg Pro 
915 920 925 

gac acc tcc ttc ctg tgg ttt acc tcc cca tac aag acc atg aag ttc 
Asp Thr Ser Phe Leu Trp Phe Thr Ser Pro Tyr Lys Thr Met Lys Phe 
930 935 940 945 

atc ctg tgg cgg cgt ttc cgg tgg gcc atc atc ctc ttc atc atc ctc 
Ile Leu Trp Arg Arg Phe Arg Trp Ala Ile Ile Leu Phe Ile Ile Leu 
950 955 960 

ttc atc ctg ctg ctg ttc ctg gcc atc ttc atc tac gcc ttc ccg aac 3348 
Phe Ile Leu Leu Leu Phe Leu Ala Ile Phe Ile Tyr Ala Phe Pro Asn 
965 970 975 

tat gct gcc atg aag ctg gtg aag ccc ttc agc tgaggactct cctgccctgt 3401 
Tyr Ala Ala Met Lys Leu Val Lys Pro Phe Ser 
980 985 

agaaggggcc gtggggtccc ctccagcatg ggactggcct gcctcctccg cccagctcgg 3461 

cgagctcctc cagacctcct aggcctgatt gtcctgccag ggtgggcaga cagacagatg 3521 

gaccggccca cactcccaga gttgctaaca tggagctctg agatcacccc acttccatca 3581 

tttccttctc ccccaaccca acgctttttt ggatcagctc agacatattt cagtataaaa 3641 

cagttggaac cacaaaaaaa aaaaaaaaaa 

<210> 233 
<211> 988 
<212> PRT 
<213> Homo sapiens 



<400> 233 

Met Leu Gly Arg Leu Ser Leu Glu Lys Gly Arg Phe Val Asn Pro Gly 
1 5 10 15 

Gly Arg Gly Arg Asp Pro Gly Glu Gly Gly Val Met Asp Asp Lys Ser 
20 25 30 

Glu Asp Ser Met Ser Val Ser Thr Leu Ser Phe Gly Val Asn Arg Pro 
35 40 45 

Thr Ile Ser Cys Ile Phe Asp Tyr Gly Asn Arg Tyr His Leu Arg Cys 
50 55 60 

Tyr Met Tyr Gln Ala Arg Asp Leu Ala Ala Met Asp Lys Asp Ser Phe 
65 70 75 80 

Ser Asp Pro Tyr Ala Ile Val Ser Phe Leu His Gln Ser Gln Lys Thr 
85 90 95 

Val Val Val Lys Asn Thr Leu Asn Pro Thr Trp Asp Gln Thr Leu Ile 
100 105 110 

Phe Tyr Glu Ile Glu Ile Phe Gly Glu Pro Ala Thr Val Ala Glu Gln 
115 120 125 

Pro Pro Ser Ile Val Val Glu Leu Tyr Asp His Asp Thr Tyr Gly Ala 
130 135 140 

Asp Glu Phe Met Gly Arg Cys Ile Cys Gln Pro Ser Leu Glu Arg Met 
145 150 155 160 

Pro Arg Leu Ala Trp Phe Pro Leu Thr Arg Gly Ser Gln Pro Ser Gly 
165 170 175 

Glu Leu Leu Ala Ser Phe Glu Leu Ile Gln Arg Glu Lys Pro Ala Ile 
180 185 190 

His His Ile Pro Gly Phe Glu Val Gln Glu Thr Ser Arg Ile Leu Asp 
195 200 205 

Glu Ser Glu Asp Thr Asp Leu Pro Tyr Pro Pro Pro Gln Arg Glu Ala 
210 215 220 

Asn Ile Tyr Met Val Pro Gln Asn Ile Lys Pro Ala Leu Gln Arg Thr 
225 230 235 240 

Ala Ile Glu Ile Leu Ala Trp Gly Leu Arg Asn Met Lys Ser Tyr Gln 
245 250 255 

Leu Ala Asn Ile Ser Ser Pro Ser Leu Val Val Glu Cys Gly Gly Gln 
260 265 270 

Thr Val Gln Ser Cys Val Ile Arg Asn Leu Arg Lys Asn Pro Asn Phe 
275 280 285 

Asp Ile Cys Thr Leu Phe Met Glu Val Met Leu Pro Arg Glu Glu Leu 
290 295 300 

Tyr Cys Pro Pro Ile Thr Val Lys Val Ile Asp Asn Arg Gln Phe Gly 
305 310 315 320 

Arg Arg Pro Val Val Gly Gln Cys Thr Ile Arg Ser Leu Glu Ser Phe 
325 330 335 

Leu Cys Asp Pro Tyr Ser Ala Glu Ser Pro Ser Pro Gln Gly Gly Pro 
340 345 350 

Asp Asp Val Ser Leu Leu Ser Pro Gly Glu Asp Val Leu Ile Asp Ile 
355 360 365 

Asp Asp Lys Glu Pro Leu Ile Pro Ile Gln Glu Glu Glu Phe Ile Asp 
370 375 380 

Trp Trp Ser Lys Phe Phe Ala Ser Ile Gly Glu Arg Glu Lys Cys Gly 
385 390 395 400 

Ser Tyr Leu Glu Lys Asp Phe Asp Thr Leu Lys Val Tyr Asp Thr Gln 
405 410 415 

Leu Glu Asn Val Glu Ala Phe Glu Gly Leu Ser Asp Phe Cys Asn Thr 
420 425 430 

Phe Lys Leu Tyr Arg Gly Lys Thr Gln Glu Glu Thr Glu Asp Pro Ser 
435 440 445 

Val Ile Gly Glu Phe Lys Gly Leu Phe Lys Ile Tyr Pro Leu Pro Glu 
450 455 460 

Asp Pro Ala Ile Pro Met Pro Pro Arg Gln Phe His Gln Leu Ala Ala 
465 470 475 480 

Gln Gly Pro Gln Glu Cys Leu Val Arg Ile Tyr Ile Val Arg Ala Phe 
485 490 495 

Gly Leu Gln Pro Lys Asp Pro Asn Gly Lys Cys Asp Pro Tyr Ile Lys 
500 505 510 

Ile Ser Ile Gly Lys Lys Ser Val Ser Asp Gln Asp Asn Tyr Ile Pro 
515 520 525 




Cys Thr Leu Glu Pro Val Phe Gly Lys Met Phe Glu Leu Thr Cys Thr 
530 535 540 

Leu Pro Leu Glu Lys Asp Leu Lys Ile Thr Leu Tyr Asp Tyr Asp Leu 
545 550 555 560 

Leu Ser Lys Asp Glu Lys Ile Gly Glu Thr Val Val Asp Leu Glu Asn 
565 570 575 

Arg Leu Leu Ser Lys Phe Gly Ala Arg Cys Gly Leu Pro Gln Thr Tyr 
580 585 590 

Cys Val Ser Gly Pro Asn Gln Trp Arg Asp Gln Leu Arg Pro Ser Gln 
595 600 605 

Leu Leu His Leu Phe Cys Gln Gln His Arg Val Lys Ala Pro Val Tyr 
610 615 620 

Arg Thr Asp Arg Val Met Phe Gln Asp Lys Glu Tyr Ser Ile Glu Glu 
625 630 635 640 

Ile Glu Ala Gly Arg Ile Pro Asn Pro His Leu Gly Pro Val Glu Glu 
645 650 655 

Arg Leu Ala Leu His Val Leu Gln Gln Gln Gly Leu Val Pro Glu His 
660 665 670 

Val Glu Ser Arg Pro Leu Tyr Ser Pro Leu Gln Pro Asp Ile Glu Gln 
675 680 685 

Gly Lys Leu Gln Met Trp Val Asp Leu Phe Pro Lys Ala Leu Gly Arg 
690 695 700 

Pro Gly Pro Pro Phe Asn Ile Thr Pro Arg Arg Ala Arg Arg Phe Phe 
705 710 715 720 

Leu Arg Cys Ile Ile Trp Asn Thr Arg Asp Val Ile Leu Asp Asp Leu 
725 730 735 

Ser Leu Thr Gly Glu Lys Met Ser Asp Ile Tyr Val Lys Gly Trp Met 
740 745 750 

Ile Gly Phe Glu Glu His Lys Gln Lys Thr Asp Val His Tyr Arg Ser 
755 760 765 

Leu Gly Gly Glu Gly Asn Phe Asn Trp Arg Phe Ile Phe Pro Phe Asp 
770 775 780 

Tyr Leu Pro Ala Glu Gln Val Cys Thr Ile Ala Lys Lys Asp Ala Phe 
785 790 795 800 

Trp Arg Leu Asp Lys Thr Glu Ser Lys Ile Pro Ala Arg Val Val Phe 
805 810 815 

Gln Ile Trp Asp Asn Asp Lys Phe Ser Phe Asp Asp Phe Leu Gly Ser 
820 825 830 

Leu Gln Leu Asp Leu Asn Arg Met Pro Lys Pro Ala Lys Thr Ala Lys 
835 840 845 

Lys Cys Ser Leu Asp Gln Leu Asp Asp Ala Phe His Pro Glu Trp Phe 
850 855 860 

Val Ser Leu Phe Glu Gln Lys Thr Val Lys Gly Trp Trp Pro Cys Val 
865 870 875 880 

Ala Glu Glu Gly Glu Lys Lys Ile Leu Ala Gly Lys Leu Glu Met Thr 
885 890 895 

Leu Glu Ile Val Ala Glu Ser Glu His Glu Glu Arg Pro Ala Gly Gln 
900 905 910 

Gly Arg Asp Glu Pro Asn Met Asn Pro Lys Leu Glu Asp Pro Arg Arg 
915 920 925 

Pro Asp Thr Ser Phe Leu Trp Phe Thr Ser Pro Tyr Lys Thr Met Lys 
930 935 940 

Phe Ile Leu Trp Arg Arg Phe Arg Trp Ala Ile Ile Leu Phe Ile Ile 
945 950 955 960 

Leu Phe Ile Leu Leu Leu Phe Leu Ala Ile Phe Ile Tyr Ala Phe Pro 
965 970 975 

Asn Tyr Ala Ala Met Lys Leu Val Lys Pro Phe Ser 
980 985 
RELATED APPLICATION INFORMATION 

This application claims priority from provisional 
application serial no. 60/097,927, filed August 25, 1998. 

Statement as to Federally Sponsored Research 
The work described herein was supported in part by 
NIH grants 5P01AG12992, 5R01NS34913A, and 5P01NS31248. 
The Federal Government therefore may have certain rights 
in the invention. 



Background of the Invention 
The invention relates to genes involved in the 
onset of muscular dystrophy. 

Muscular dystrophies constitute a heterogeneous 
group of disorders. Most are characterized by weakness 
and atrophy of the proximal muscles, although in rare 
myopathies such as "Miyoshi myopathy" symptoms may first 
arise in distal muscles. Of the various hereditary types 
of muscular dystrophy, several are caused by mutations or 
deletions in genes encoding individual components of the 
dystrophin-associated protein (DAP) complex. It is this 
DAP complex that links the cytoskeletal protein 
dystrophin to the extracellular matrix protein, laminin-2. 

Muscular dystrophies may be classified according 
to the gene mutations that are associated with specific 
clinical syndromes. For example, mutations in the gene 
encoding the cytoskeletal protein dystrophin result in 

either Duchenne's Muscular Dystrophy or Becker's Muscular 
Dystrophy, whereas mutations in the gene encoding the 
extracellular matrix protein merosin produce Congenital 
Muscular Dystrophy. Muscular dystrophies with an 
autosomal recessive mode of inheritance include "Miyoshi 
myopathy" and the several limb-girdle muscular 
dystrophies (LGMD2). Of the limb-girdle muscular 
dystrophies, the deficiencies resulting in LGMD2C, D, E, 
and F result from mutations in genes encoding the 
membrane-associated sarcoglycan components of the DAP 
complex. 

Summary of the Invention 

A novel protein, designated dysferlin, is 
identified and characterized. The dysferlin gene is 
normally expressed in skeletal muscle cells and is 
selectively mutated in several families with the 
hereditary muscular dystrophies, e.g., Miyoshi myopathy 
(MM) and limb girdle muscular dystrophy-2B (LGMD2B). 
These characteristics of dysferlin render it a candidate 
disease gene for both MM and LGMD2B. An additional novel 
protein, brain-specific dysferlin, has also been 
identified. Defects in brain-specific dysferlin may 
predispose to selected disorders of the central nervous 
system. Moreover, the expression of brain-specific 
dysferlin may be important as a marker for normal neural 
development (e.g., in vivo or in neural cells in 
culture). Manipulation of levels of expression of 
brain-specific dysferlin, and of the type of expressed 
brain-specific dysferlin, is of use for analyzing the 
function of brain-specific dysferlin and related 
dysferlin-associated molecules. 

The invention features an isolated DNA which 
includes a nucleotide sequence hybridizing under 
stringent hybridization conditions to a strand of SEQ ID 
NO:3 or SEQ ID NO:117. 
The invention also features an isolated DNA 
including a nucleotide sequence selected from SEQ ID 
NOs: 4-12. 

Also within the invention is an isolated DNA 
comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NOs: 22-30. 

Also within the invention is a single stranded 
oligonucleotide of 14-50 nucleotides in length having a 
nucleotide sequence identical to a portion of a strand of 
SEQ ID NO: 3. 

Also within the invention is a pair of PCR primers 
consisting of: 

(a) a first single stranded oligonucleotide 
consisting of 14-50 contiguous nucleotides of the sense 
strand of SEQ ID NO: 117; and 

(b) a second single stranded oligonucleotide 
consisting of 14-50 contiguous nucleotides of the 
antisense strand of SEQ ID NO: 117, wherein the sequence 
of at least one of the oligonucleotides is identical to a 
portion of a strand of SEQ ID NO:3, and the first 
oligonucleotide is not complementary to the second 
oligonucleotide. 
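
The primer-pair conditions recited above can be illustrated with a minimal sketch. It is not part of the disclosure: the function names and the toy template are hypothetical, and "not complementary" is read here, as a simplifying assumption, to mean that neither oligonucleotide is contained in the reverse complement of the other.

```python
# Illustrative sketch only; names and the reading of "not complementary"
# are assumptions, not definitions taken from the specification.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_valid_primer_pair(template, forward, reverse, min_len=14, max_len=50):
    """Check a primer pair against the conditions recited for the pair.

    template: sense strand, 5'->3'
    forward:  candidate taken from the sense strand
    reverse:  candidate taken from the antisense strand (5'->3')
    """
    template = template.upper()
    forward = forward.upper()
    reverse = reverse.upper()
    antisense = reverse_complement(template)

    # Each oligonucleotide must be 14-50 nucleotides long.
    for primer in (forward, reverse):
        if not (min_len <= len(primer) <= max_len):
            return False

    # Each must be a contiguous stretch of its respective strand.
    if forward not in template or reverse not in antisense:
        return False

    # Assumed reading of "not complementary": one primer must not be
    # contained in the reverse complement of the other.
    rc_forward = reverse_complement(forward)
    if rc_forward in reverse or reverse in rc_forward:
        return False
    return True
```

For a hypothetical 60-nucleotide template, taking the first 20 nucleotides as the forward primer and the reverse complement of the last 20 as the reverse primer satisfies all three checks.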

Also within the invention is a pair of single 
stranded oligonucleotides selected from SEQ ID NOs: 
130-231, SEQ ID NO:110, and SEQ ID NO:112. 

Also within the invention is an isolated DNA 
including a nucleotide sequence that encodes a protein 
that shares at least 70% sequence identity with SEQ ID 
NO:2, or a complement of the nucleotide sequence. 

Also within the invention is an isolated DNA 
including a nucleotide sequence which hybridizes under 
stringent hybridization conditions to a strand of a 
nucleic acid, the nucleic acid having a sequence selected 
from SEQ ID NOs: 31-79 and 90-101. 
Also within the invention is a single stranded 
oligonucleotide of 14-50 nucleotides in length having a 
nucleotide sequence which is identical to a portion of a 
strand of a nucleic acid selected from SEQ ID NOs: 31-79 
and 90-100. 

Also within the invention is a pair of PCR primers 
consisting of: 

(a) a first single stranded oligonucleotide 
consisting of 14-50 contiguous nucleotides of the sense 
strand of a nucleic acid selected from SEQ ID NOs: 31-85; 
and 

(b) a second single stranded oligonucleotide 
consisting of 14-50 contiguous nucleotides of the 
antisense strand of a nucleic acid selected from SEQ ID 
NOs: 31-85, wherein the sequence of at least one of the 
oligonucleotides includes a sequence identical to a 
portion of a strand of a nucleic acid selected from SEQ 
ID NOs: 31-79 and 90-100, and the first oligonucleotide 
is not complementary to the second oligonucleotide. 

Also within the invention is a pair of single 
stranded oligonucleotides selected from SEQ ID NOs: 101-116, 
SEQ ID NOs: 184-185, SEQ ID NOs: 188-191, SEQ ID NOs: 
210-213, and SEQ ID NOs: 216-217. 

Also within the invention is a substantially pure 
protein that has an amino acid sequence sharing at least 
70% sequence identity with SEQ ID NO:2. 

Also within the invention is a substantially pure 
protein the sequence of which includes amino acid 
residues 1-500, 501-1000, 1001-1500, or 1501-2080 of SEQ 
ID NO:2. 

Also within the invention is a substantially pure 
protein including the amino acid sequence of SEQ ID 
NO: 86, SEQ ID NO: 87, SEQ ID NO: 88, or SEQ ID NO: 89. 

In another aspect, the invention features a 
transgenic non-human mammal having a transgene disrupting 
or interfering with the expression of a dysferlin gene, 
the transgene being chromosomally integrated into the 
germ cells of the animal. 

Another embodiment of the invention features a 
method of decreasing the symptoms of muscular dystrophy 
in a mammal by introducing into a cell of the mammal 
(e.g., a muscle cell or a muscle precursor cell) an 
isolated DNA which hybridizes under stringent 
hybridization conditions to a strand of SEQ ID NO: 3. 

Another aspect of the invention provides a method 
for identifying a patient, a fetus, or a pre-embryo at 
risk for having a dysferlin-related disorder by (a) 
providing a sample of genomic DNA from the patient, 
fetus, or pre-embryo; and (b) determining whether the 
sample contains a mutation in a dysferlin gene. 

In another aspect, the invention provides a method 
for identifying a patient, a fetus, or a pre-embryo at 
risk for having a dysferlin-related disorder by (a) 
providing a sample including dysferlin mRNA from the 
patient, fetus, or pre-embryo; and (b) determining 
whether the dysferlin mRNA contains a mutation. 

Methods of identifying mutations in a dysferlin 
sequence are useful for predicting (e.g., predicting 
whether an individual is at risk for developing a 
dysferlin-related disorder) or diagnosing disorders 
associated with dysferlin, e.g., MM and LGMD2B. Such 
methods can also be used to determine if an individual, 
fetus, or a pre-embryo is a carrier of a dysferlin 
mutation, for example in screening procedures. Methods 
which distinguish between different dysferlin alleles 
(e.g., a mutant dysferlin allele and a normal dysferlin 
allele) can be used to determine carrier status. 

The invention also features an isolated nucleic 
acid comprising a nucleotide sequence which hybridizes 
under stringent hybridization conditions to nucleotides 
3284-3720 of SEQ ID NO:232, or the complement of the 
nucleotide sequence. An isolated nucleic acid including 
a nucleotide sequence identical to the sequence of 
nucleotides 3284-3720 of SEQ ID NO:232, or a complement 
of the nucleotide sequence, is also a feature of the 
invention. The isolated nucleic acid can include the 
entire sequence of SEQ ID NO:232 or the complement of SEQ 
ID NO:232. 

Another aspect of the invention features an 
isolated polypeptide that includes: a) at least 15 
contiguous amino acids of the polypeptide comprising 
amino acids 1-24 of SEQ ID NO:233, b) a naturally 
occurring allelic variant of a polypeptide comprising 
amino acids 1-24 of SEQ ID NO:233, or c) an amino 
acid sequence which is encoded by a nucleic acid molecule 
which hybridizes under stringent conditions to 
nucleotides 3284-3720 of SEQ ID NO:232. The polypeptide 
of this aspect can include the entire sequence of SEQ ID 
NO:233. 

Also included in the invention is a vector 
comprising the nucleic acid of claim 44 and a cell that 
contains the vector. Another aspect of the invention 
features a method of making a polypeptide by culturing 
the cell which contains the vector. 

The invention also features an antibody which 
specifically binds to a polypeptide such as those 
described above. The antibody can bind to a polypeptide 
selected from amino acids 253-403 of SEQ ID NO:233, amino 
acids 624-865 of SEQ ID NO:233, and amino acids 1664-1786 
of SEQ ID NO:233. Antibodies of the invention can be 
monoclonal or polyclonal antibodies. 

An "isolated DNA" is DNA which has a naturally 
occurring sequence corresponding to part or all of a 
given gene but is free of the two genes that normally 
flank the given gene in the genome of the organism in 
which the given gene naturally occurs. The term 
therefore includes a recombinant DNA incorporated into a 
vector, into an autonomously replicating plasmid or 
virus, or into the genomic DNA of a prokaryote or 
eukaryote. It also includes a separate molecule such as 
a cDNA, a genomic fragment, a fragment produced by 
polymerase chain reaction (PCR), or a restriction 
fragment, as well as a recombinant nucleotide sequence 
that is part of a hybrid gene, i.e., a gene encoding a 
fusion protein. The term excludes intact chromosomes and 
large genomic segments containing multiple genes 
contained in vectors or constructs such as cosmids, yeast 
artificial chromosomes (YACs), and P1-derived artificial 
chromosome (PAC) contigs. 

A "noncoding sequence" is a sequence which 
corresponds to part or all of an intron of a gene, or to 
a sequence which is 5' or 3' to a coding sequence and so 
is not normally translated. 

An expression control sequence is "operably 
linked" to a coding sequence when it is within the same 
nucleic acid and can control expression of the coding 
sequence. 

A "protein" or "polypeptide" is any chain of amino 
acids linked by peptide bonds, regardless of length or 
post-translational modification, e.g., glycosylation or 
phosphorylation. 

As used herein, the term "percent sequence 
identity" means the percentage of identical subunits at 
corresponding positions in two sequences when the two 
sequences are aligned to maximize subunit matching, i.e., 
taking into account gaps and insertions. For purposes of 
the present invention, percent sequence identity between 
two polypeptides is to be determined using the Gap 
program and the default parameters as specified therein. 
The Gap program is part of the Sequence Analysis Software 
Package of the Genetics Computer Group, University of 
Wisconsin Biotechnology Center, 1710 University Avenue, 
Madison, WI 53705. 

The algorithm of Myers and Miller, CABIOS (1989) 
can also be used to determine whether two sequences are 
similar or identical. Such an algorithm is incorporated 
into the ALIGN program (version 2.0) which is part of the 
GCG sequence alignment software package. When utilizing 
the ALIGN program for comparing amino acid sequences, a 
PAM120 weight residue table, a gap length penalty of 12, 
and a gap penalty of 4 can be used. 
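
As a rough illustration of the "percent sequence identity" definition, the following sketch scores two sequences that have already been aligned (for example by Gap or ALIGN). It does not reimplement either program. The function name is hypothetical, and the choice of denominator (the number of alignment columns, gap columns included) is an assumption, since the definition above does not fix it.

```python
# Illustrative sketch only: computes identity over a pre-computed alignment.
# Assumption: identity = identical columns / total alignment columns * 100.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For example, aligning the first ten residues of SEQ ID NO:233 (MLGRLSLEKG, one-letter code) against a hypothetical variant written as MLGR-SLDKG gives eight identical columns out of ten.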

As used herein, the term "stringent hybridization 
conditions" means the following DNA hybridization and 
wash conditions: hybridization at 60°C in the presence 
of 6 x SSC, 0.5% SDS, 5 x Denhardt's Reagent, and 100 
µg/ml denatured salmon sperm DNA; followed by a first 
wash at room temperature for 20 minutes in 0.5 x SSC and 
0.1% SDS and a second wash at 55°C for 30 minutes in 0.2 
x SSC and 0.1% SDS. 

A "substantially pure protein" is a protein 
separated from components that naturally accompany it. 
The protein is considered to be substantially pure when 
it is at least 60%, by dry weight, free from the proteins 
and other naturally-occurring organic molecules with 
which it is naturally associated. Preferably, the purity 
of the preparation is at least 75%, more preferably at 
least 90%, and most preferably at least 99%, by weight. 
A substantially pure dysferlin protein can be obtained, 
for example, by extraction from a natural source, by 
expression of a recombinant nucleic acid encoding a 
dysferlin polypeptide, or by chemical synthesis. Purity 
can be measured by any appropriate method, e.g., column 
chromatography, polyacrylamide gel electrophoresis, or 
HPLC analysis. A chemically synthesized protein or a 
recombinant protein produced in a cell type other than 
the cell type in which it naturally occurs is, by 
definition, substantially free from components that 
naturally accompany it. Accordingly, substantially pure 
proteins include those having sequences derived from 
eukaryotic organisms but which have been recombinantly 
produced in E. coli or other prokaryotes. 

An antibody that "specifically binds" to an 
antigen is an antibody that recognizes and binds to the 
antigen, e.g., a dysferlin polypeptide, but which does 
not substantially recognize and bind to other molecules 
in a sample (e.g., a biological sample) which naturally 
includes the antigen, e.g., a dysferlin polypeptide. An 
antibody that "specifically binds" to dysferlin is 
sufficient to detect a dysferlin polypeptide in a 
biological sample using one or more standard 
immunological techniques (for example, Western blotting 
or immunoprecipitation). 

A "transgene" is any piece of DNA, other than an 
intact chromosome, which is inserted by artifice into a 
cell, and becomes part of the genome of the organism 
which develops from that cell. Such a transgene may 
include a gene which is partly or entirely heterologous 
(i.e., foreign) to the host organism, or may represent a 
gene homologous to an endogenous gene of the organism. 

Unless otherwise defined, all technical and 
scientific terms used herein have the same meaning as 
commonly understood by one of ordinary skill in the art 
to which this invention belongs. Methods and materials 
similar or equivalent to those described herein can be 
used in the practice or testing of the present invention. 
The present materials, methods, and examples are 
illustrative only and not intended to be limiting. All 
publications, patent applications, patents, and other 
references mentioned herein are incorporated by reference 
in their entirety. In case of conflict, the present 
specification, including definitions, will control. All 
the sequences disclosed in the sequence listing are meant 
to be double-stranded except the sequences of 
oligonucleotides. 

Other features and advantages of the invention 
will be apparent from the following detailed description, 
and from the claims. 

Brief Description of the Drawings 
Fig. 1A is a physical map of the MM locus. Arrows 
indicate the five new polymorphic markers and filled, 
vertical rectangular boxes indicate the previously known 
polymorphic markers. The five ESTs that are expressed in 
skeletal muscle are highlighted in bold. Detailed 
information on the minimal tiling path of the PAC contig 
spanning the MM/LGMD2B region is provided in Liu et al., 
1998, Genomics 49:23-29. The minimal candidate MM region 
is designated by the solid bracket (top) and compared to 
the previous candidate region (dashed bracket). TGFA and 
ADD2 are transforming growth factor alpha and β-adducin 
2. 

Fig. 1B is a representation of the dysferlin cDNA 
clones. The probes used in the three successive screens 
are shown in bold (130347, cDNA10, A27-F2R2). The two 
most 5' cDNA clones are also shown (B22, B33). The 6.9 
kb cDNA for dysferlin (SEQ ID NO:1) is illustrated at the 
bottom with start and stop codons as shown. 

Fig. 1C is a representation of the predicted 
dysferlin protein. The locations of four C2 domains 
(SEQ ID NOs: 86-89) are indicated by stippled boxes, 
while the putative transmembrane region is hatched. 
Vertical lines above the cDNA denote the positions of the 
mutations in Table 2; the associated labels indicate the 
phenotypes (MM - Miyoshi myopathy; LGMD - limb girdle 
muscular dystrophy; DMAT - distal myopathy with anterior 
tibial onset). 

Fig. 2 is the sequence of the predicted 2,080 
amino acids of dysferlin (SEQ ID NO:2). The predicted 
membrane spanning residues are in bold at the carboxy 
terminus (residues 2047-2063). Partial C2 domains are 
underlined. Bold, underlined sequences are putative 
nuclear targeting residues. Possible membrane retention 
sequences are enclosed within a box. 

Fig. 3 is a comparison of the Kyte-Doolittle 
hydrophobicity plots of the dysferlin protein and fer-1. 
On the Y-axis, increasing positivity corresponds to 
increasing hydrophobicity. Both proteins have a single, 
highly hydrophobic stretch at the carboxy terminal end 
(arrow). Both share regions of relative hydrophilicity 
approximately at residue 1,000 (arrowhead). 
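
A hydrophobicity plot of the kind described for Fig. 3 can be approximated with the short sketch below. It uses the published Kyte-Doolittle hydropathy scale with a sliding-window average. The function name and the 19-residue default window are assumptions; the window actually used to generate Fig. 3 is not stated here.

```python
# Illustrative sketch only: sliding-window Kyte-Doolittle hydropathy profile.
# The window size is an assumed parameter, not a value from the specification.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of each window along a one-letter sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Residues 954-974 of SEQ ID NO:233 (carboxy-terminal hydrophobic stretch),
# transcribed to one-letter code from the sequence listing.
C_TERMINAL_SEGMENT = "AIILFIILFILLLFLAIFIYA"
```

Higher positive values indicate greater hydrophobicity, matching the Y-axis convention stated for Fig. 3; every 19-residue window over the carboxy-terminal segment above scores well above zero.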

Fig. 4 is an SSCP analysis of a representative 
pedigree with dysferlin mutations. Each member of the 
pedigree is illustrated above the corresponding SSCP 
analysis. For each affected individual (solid symbols) 
shifts are evident in alleles 1 and 2, corresponding 
respectively to exons 36 and 54. As indicated, the 
allele 1 and 2 variants are transmitted respectively from 
the mother and the father. The two affected daughters in 
this pedigree have the limb girdle muscular dystrophy 
(LGMD) phenotype while their affected brother has a 
pattern of weakness suggestive of Miyoshi myopathy (MM). 

Fig. 5 is a representation of the genomic 
structure of dysferlin. The 55 exons of the dysferlin 
gene and their corresponding SEQ ID NOs are indicated 
below the 6911 bp cDNA (solid line). The cDNA sequences 
corresponding to SEQ ID NO:1 and SEQ ID NO:3 are shown 
relative to the 6911 bp cDNA. 

Figs. 6A-B are the cDNA sequence of brain-specific 
dysferlin (SEQ ID NO:232) and the predicted amino acid 
sequence (in single-letter code) of brain-specific 
dysferlin (SEQ ID NO:233). 



Detailed Description 
The Miyoshi myopathy (MM) locus maps to human 
chromosome 2p12-14 between the genetic markers D2S292 and 
D2S286 (Bejaoui et al., 1995, Neurology 45:768-72). 
Further refined genetic mapping in MM families placed the 
MM locus between markers GGAA-P7430 and D2S2109 (Bejaoui 
et al., 1998, Neurogenetics 1:189-96). Independent 
investigation has localized the limb-girdle muscular 
dystrophy (LGMD-2B) to the same genetic interval (Bashir 
et al., 1994, Hum. Molec. Genetics 3:455-57; Bashir et 
al., 1996, Genomics 33:46-52; Passos-Bueno et al., 1995, 
Genomics 27:192-95). Furthermore, two large, inbred 
kindreds have been described whose members include both 
MM and LGMD2B patients (Weiler et al., 1996, Am. J. Hum. 
Genet. 59:872-78; Illarioshkin et al., 1997, Genomics 
42:345-48). In these familial studies, the disease 
gene(s) for both MM and LGMD2B mapped to essentially the 
same genetic interval. Moreover, in both pedigrees, 
individuals with MM or LGMD2B phenotypes share the same 
haplotypes. This raises the intriguing possibility that 
the two diseases may arise from the same gene defect and 
that a particular disease phenotype is the result of 
modification by additional factors. 

A 3-Mb PAC contig spanning the entire MM/LGMD2B 
candidate region was recently constructed to facilitate 
the cloning of the MM/LGMD2B gene(s) (Liu et al., 1998, 
Genomics 49:23-29). This high resolution PAC contig 
resolved the discrepancies of the order of markers in 
previous studies (Bejaoui et al., 1998, Neurogenetics 
1:189-96; Bashir et al., 1996, Genomics 33:46-52; Hudson 
et al., 1995, Science 270:1945-54). The physical size of 
the PAC contig also indicated that the previous minimal 
size estimate, based on YAC mapping data, was a 
significant underestimate. 

Identification of Repeat Sequences and Repeat Typing 

The PAC contig spanning the MM/LGMD2B region (Liu 
et al., 1998, Genomics 49:23-29) was used as a source for 
the isolation of new informative markers to narrow the 
genetic interval of the disease gene(s). DNA from the 
PAC clones spanning the MM/LGMD2B region was spotted onto 
Hybond N+™ membrane filters (Amersham, Arlington Heights, 
IL). The filters were hybridized independently with the 
following γ-32P (Du Pont, Wilmington, DE) labeled repeat 
sequences: (1) (CA)15; (2) pool of (ATT)10, (GATA)8 and 
(GGAA)8; (3) pool of (GAAT)8, (GGAT)8 and (GTAT)8; and (4) 
pool of (AAG)10 and (ATC)10. Hybridization and washing of 
the filters were carried out at 55°C following standard 
protocols (Sambrook et al., 1989, Molecular Cloning: A 
Laboratory Manual (2nd Edition), Cold Spring Harbor 
Press, N.Y.). 
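
The repeat probes above are written in motif-and-count notation, e.g., (CA)15 for fifteen tandem CA units. As an in silico analogue of the hybridization screen, the sketch below expands such a probe and scans a sequence for tandem runs of a motif. The function names and the minimum-unit threshold are hypothetical, and the example sequence in the usage note is invented for illustration.

```python
# Illustrative sketch only: expands repeat-probe notation and finds tandem
# repeats by regular expression. It does not model hybridization itself.
import re


def expand_repeat(motif, units):
    """Expand notation such as (CA)15 into the full oligonucleotide."""
    return motif.upper() * units


def find_tandem_repeats(seq, motif, min_units=5):
    """Return (start, unit_count) for each run of >= min_units tandem motifs."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif.upper()), min_units))
    return [
        (m.start(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(seq.upper())
    ]
```

For an invented sequence consisting of GGT, fifteen CA units, and TTG, `find_tandem_repeats(seq, "CA")` reports one run starting at position 3 with 15 units.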

Miniprep DNAs of PAC clones containing repeat 
sequences were digested with restriction enzymes HindIII 
and PstI and ligated into pBluescript II (KS+) vector 
(Stratagene, La Jolla, CA) that had been digested with the 
same enzymes. Filters of the PAC subclones were 
hybridized to the γ-32P labeled repeats that detected the 
respective PACs. For clones with an insert size greater 
than 1 kb the repeat sequences of which could not be 
identified by a single round of sequencing, the inserts 
were further subcloned by digestion with HaeIII and 
ligation in EcoRV-digested pZero-2.1 vector (Invitrogen, 
Inc., Carlsbad, CA). Miniprep DNAs of the positive 
subclones were subjected to manual dideoxy sequencing 
with Sequenase™ enzyme (US Biochemicals, Inc., Cleveland, 
OH). Primer pairs for amplifying the repeat sequences 
were selected using the computer program Oligo (Version 
4.0, National Biosciences, Inc., 
sequences are shown in Table 1. 
Identification of Repeat Markers and Haplotype Analysis 

After hybridization with labeled repeat oligos, 
17 different groups of overlapping PACs were identified 
that contained repeat sequences. Some groups contained 
previously identified repeat markers. For example, five 
groups of PACs were positively identified by a pool of 
repeat probes including (ATT)10, (GATA)8, and (GGAA)8. Of 
these, three groups contained known markers GGAA-P7430 
(GGAA repeat), D2S1394 (GATA repeat) and D2S1398 (GGAA 
repeat) (Hudson et al., 1992, Nature 13:622-29; Gastier 
et al., 1995, Hum. Molecular Genetics 4:1829-36). No 
attempt was made to isolate new repeat markers from these 
PACs and they were not further analyzed. Similarly, 
seven groups of PACs that contained known CA repeat 
markers were excluded. Seven groups of PACs that 
contained unidentified repeats were retained for further 
analysis. For each group, the PAC containing the 
smallest insert was selected for subcloning. Subclones 
were re-screened and positive clones were sequenced to 
identify repeats. In total, seven new repeat sequences 
were identified within the MM/LGMD2B PAC contig. Of 
these, five are polymorphic within the population that 
was tested. The information for these five markers is 
summarized in Table 1. Based on the PAC contig 
constructed previously across the MM candidate locus (Liu 
et al., 1998, Genomics 49:23-29), the five new markers 
and ten previously published polymorphic markers were 
placed in an unambiguous order (Fig. 1). 

These markers were analyzed in a large, 
consanguineous MM family (Bejaoui et al., 1995, Neurology 
45:768-72; Bejaoui et al., 1998, Neurogenetics 1:189-96). 
Because MM is a recessive condition, the locus can 
be defined by identifying regions of the genome that show 
homozygosity in affected individuals. Conversely, 
because of the high penetrance of this adult-onset 
condition, unaffected adult individuals are not expected 
to be homozygous by descent across the region. Analysis 
of haplotype homozygosity in this pedigree indicates that 
the disease gene lies between markers D2S2111 and 
PAC3-H52. Based on the PAC mapping data, the physical 
distance for this interval is approximately 2.0 Mb. No 
recombination events were detected between four 
informative markers (markers cyl72-H32 to PAC16-H41) and 
the disease locus in family MM-21 (Fig. 1A). 
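
The homozygosity reasoning above can be sketched in a few lines: given markers in physical order and genotypes of affected individuals, find the stretches over which every affected individual is homozygous for the same allele. This is a simplified illustration, not the haplotype analysis actually performed. The function name, data layout, and the marker names and genotypes in the usage note are invented.

```python
# Illustrative sketch only. Assumed input layout (hypothetical):
#   markers:   list of marker names in physical order
#   genotypes: {individual: {marker: (allele1, allele2)}} for affected
#              individuals only

def shared_homozygous_intervals(markers, genotypes):
    """Return (first_marker, last_marker) for each run of consecutive
    markers at which all affected individuals are homozygous for one
    and the same allele."""
    flags = []
    for marker in markers:
        alleles = set()
        all_homozygous = True
        for calls in genotypes.values():
            a1, a2 = calls[marker]
            if a1 != a2:
                all_homozygous = False
                break
            alleles.add(a1)
        flags.append(all_homozygous and len(alleles) == 1)

    intervals = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            intervals.append((markers[start], markers[i - 1]))
            start = None
    if start is not None:
        intervals.append((markers[start], markers[-1]))
    return intervals
```

With five invented markers M1 to M5 and two affected individuals who are both homozygous for the same alleles only at M2, M3, and M4, the function returns the single interval (M2, M4); the flanking heterozygous or discordant markers bound the candidate region.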

Identification of Five Muscle-Expressed ESTs 

Twenty-two ESTs and two genes (transforming growth 
factor alpha [TGFα] and beta-adducin [ADD2]) were 
previously mapped to the MM/LGMD2B PAC contig (Fig. 1A) 
(Liu et al., 1998, Genomics 49:23-29). Two µl 
(approximately 0.1 ng/µl) of Marathon-ready™ skeletal 
muscle cDNA (Clontech, Palo Alto, CA) were used as 
template in a 10 µl PCR reaction for analysis of muscle 
expression of ESTs. The PCR conditions were the same as 
for the PCR typing of repeat markers. PCR analysis of 
skeletal muscle cDNA indicated that five of these ESTs 
(A006G04, stSG1553R, WI-14958, TIGR-A004Z44 and WI-14051) 
map within the minimal genetic MM interval and are 
expressed in skeletal muscle. 

Probes were selected corresponding to each of
these five ESTs for Northern blot analysis. cDNA clones
(130347, 48106, 172575, 184080, and 510138) corresponding
to the five ESTs that are expressed in muscle
(respectively TIGR-A004Z44, WI-14051, WI-14958, stSG1553R
and A006G04) were selected from the UniGene database
(http://www.ncbi.nlm.nih.gov/UniGene/) and obtained from
Genome Systems, Inc. (St. Louis, MO). The cDNA probes
were first used to screen the MM/LGMD2B PAC filters to
confirm that they mapped to the expected position in the
MM/LGMD2B contig.

A Northern blot (Clontech) of multiple human
tissues was sequentially hybridized to the five cDNA
probes and a control β-actin cDNA at 65°C following
standard hybridization and washing protocols (Sambrook et
al., supra). Between hybridizations, probes were removed
by boiling the blot at 95-100°C for 4-10 min with 0.5%
SDS. The blot was then re-exposed for 24 h to confirm
the absence of previous hybridization signals before
proceeding with the next round of hybridization.

The tissue distribution, intensity of the signals
and size of transcripts detected by the five cDNA probes
varied. Probes corresponding to ESTs stSG1553R,
TIGR-A004Z44 and WI-14958 detected strong signals in skeletal
muscle. In addition, the cDNA corresponding to
TIGR-A004Z44 detected a 3.6-3.8 kb brain-specific transcript
instead of the 8.5 kb message that was present in other
tissues. It is likely that these five ESTs correspond to
different genes since the corresponding cDNA probes used
for Northern analysis derive from the 3' end of messages,
map to different positions in the MM/LGMD2B contig (Fig.
1A), and differ in their expression patterns.

Current database analysis suggests that three of
these ESTs (stSG1553R, WI-14958 and WI-14051) do not
match any known proteins (Schuler et al., 1996, Science
274:540-46). A006G04 has weak homology with a protein
sequence of unknown function that derives from C.
elegans. TIGR-A004Z44 has homology only to subdomains
present within protein kinase C. Because the five genes
corresponding to the ESTs are expressed in skeletal
muscle and map within the minimal genetic interval of the
MM/LGMD2B gene(s), they are candidate MM/LGMD2B gene(s).

Cloning of Dysferlin cDNA

EST TIGR-A004Z44 gave a particularly strong skeletal
muscle signal on the Northern blot. Moreover, it is
bracketed by genetic markers that show no recombination
with the disease phenotype in family MM-21 (Fig. 1). The
corresponding transcript was therefore cloned and
analyzed as a candidate MM gene. From the UniGene
database, a cDNA IMAGE clone (130347, 979 bp) was
identified that contained the 483 bp EST TIGR-A004Z44.

Approximately 1 x 10^6 recombinant clones of a λgt11
human skeletal muscle cDNA library (Clontech) were plated
and screened following standard techniques (Sambrook et
al., supra). The initial library screening was performed
using the insert released from the clone 130347 that
contains EST TIGR-A004Z44, corresponding to the 3' end
of the gene. Positive phages were plaque purified and
phage DNA was isolated according to standard procedures
(Sambrook et al., supra). The inserts of the positive
clones were released by EcoRI digestion of phage DNA and
subsequently subcloned into the EcoRI site of pBluescript
II (KS+) vector (Stratagene).

Fifty cDNA clones were identified when a human
skeletal muscle cDNA library was screened with the 130347
cDNA. Clone cDNA10 with the largest insert (~6.5 kb)
(Fig. 1B) was digested independently with BamHI and PstI
and further subcloned into pBluescript vector. Miniprep
DNA of cDNA clones and subclones of cDNA10 was prepared
using the Qiagen plasmid Miniprep kit (Valencia, CA).
Sequencing was carried out from both ends of each clone
using the SequiTherm EXCEL™ long-read DNA sequencing kit
(Epicentre, Madison, WI), fluorescent-labeled M13 forward
and reverse primers, and a LI-COR sequencer (Lincoln,
NE). Assembly of cDNA contigs and sequence analysis were
performed using Sequencher software (Gene Codes
Corporation, Inc., Ann Arbor, MI).

Two additional screens, first with the insert of
cDNA10 and then a 683 bp PCR product (A27-F2R2) amplified
from the 5' end of the cDNA contig, identified 87
additional cDNA clones. Clones B22 and B33 extended the
5' end by 94 and 20 bp, respectively. The compiled
sequence allowed for the generation of a sequence of
6.9 kb (SEQ ID NO:1) (with 10-fold average coverage).

Although the 5' end of the gene has not been further
extended to the 8.5 kb predicted by Northern analysis, an
open reading frame (ORF) of 6,243 bp has been identified
within this 6.9 kb sequence. This ORF is preceded by an
in-frame stop codon and begins with the sequence
cgcaagcATGCTG (SEQ ID NO:118); five of the first seven bp
are consistent with the Kozak consensus sequence for a
start codon (Kozak, 1989, Nucl. Acids Res. 15:8125-33;
Kozak, 1989, J. Cell. Biol. 108:229-41). An alternate
start codon, in the same frame, 75 bp downstream
(GAGACGATGGGG; SEQ ID NO:119), appears less likely as a
start site. Thus, the entire coding region of this
candidate gene is believed to have been identified, as
represented by the 6.9 kb sequence contig.
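The count of consensus matches upstream of the start codon can be reproduced with a short sketch. It assumes the commonly cited vertebrate Kozak context GCCGCC(A/G)CC ATG G, of which only the seven bases immediately upstream of the ATG are scored; the constant and function names are illustrative and do not come from the specification.

```python
# Assumed consensus for positions -7..-1 relative to the A of ATG,
# taken from the commonly cited context GCCGCC(A/G)CCATGG; R = A or G.
UPSTREAM_CONSENSUS = "CGCCRCC"


def kozak_upstream_matches(context: str) -> tuple:
    """Return (matches, positions_scored) for the bases preceding the first
    ATG in `context`, aligned to the right-hand end of the consensus."""
    seq = context.upper()
    atg = seq.find("ATG")
    if atg < 1:
        raise ValueError("no ATG with upstream sequence found")
    upstream = seq[max(0, atg - len(UPSTREAM_CONSENSUS)):atg]
    consensus = UPSTREAM_CONSENSUS[-len(upstream):]
    matches = 0
    for base, expected in zip(upstream, consensus):
        if expected == "R":
            matches += base in "AG"   # purine position
        else:
            matches += base == expected
    return matches, len(upstream)


# Context of the proposed dysferlin start codon quoted in the text.
print(kozak_upstream_matches("cgcaagcATGCTG"))
```

For the quoted context the sketch scores five of seven upstream positions as matching, in line with the statement in the text; under a different statement of the consensus the count could differ.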

Isolation of the Brain-Specific Dysferlin Isoform 

Identification of the brain-specific isoform of dysferlin

A brain-specific isoform of dysferlin was identified
using Northern blot analysis of poly(A+) RNA derived from
multiple human adult tissues probed with radiolabeled
full-length dysferlin cDNA subclones. A prominent 7.2 kb
transcript was detected on Northern blots in skeletal
muscle, heart, placenta, lung, and kidney, while a
distinct but equally prominent 3.6 kb-3.8 kb transcript
was identified exclusively in the brain. Using long
exposures, a faint 7.2 kb mRNA was also detected in the
brain. This finding suggested that the shorter brain
isoform was likely to be a tissue-specific splice variant
of the dysferlin gene. To test this hypothesis, a human
brain cDNA library (Stratagene) was screened for the
dysferlin brain isoform.

Cloning of the brain-specific dysferlin isoform

To identify probes that hybridize to the brain-specific
dysferlin sequence and so could be used for
library screening, fragments of the full-length dysferlin
cDNA clone (derived from a skeletal muscle cDNA library)
were generated using restriction enzymes. The fragments
were about 1 kb in length and were analyzed by
hybridization to a Northern blot that included brain RNA.
Sequences suitable for library screening were those that
hybridized to the 3.6-3.8 kb brain-specific transcript.
A region of the 3' end of the dysferlin cDNA sequence
that is approximately 3 kb in length was identified as
hybridizing to brain mRNA. DNA containing sequence from
this region was used as a probe for hybridization
screening of a human brain cDNA library (Stratagene).

The human brain cDNA library was plated out and
screened using standard procedures. Of the approximately
720,000 plaques screened, 63 primary positive clones were
identified. Of these, 20 clones were selected for
further analysis involving standard methods of
hybridization, restriction enzyme mapping, and
sequencing. The primary positive clones shared regions
of overlap with each other.

Sequencing of positive clones provided 3671
nucleotides of the brain-specific dysferlin sequence (SEQ
ID NO:232; Figure 6A-B). The identified sequence
corresponds closely to the size of the brain-specific
dysferlin transcript detected on Northern blots. With
the exception of the 5' region of the sequence, the
brain-specific sequence is identical to about 3.1 kb of
the dysferlin sequence (from nucleotide 3722 to 6904 of
the dysferlin sequence). In the dysferlin gene, position
3722 corresponds to the start of exon 32. This finding
is consistent with the hypothesis that the brain isoform
is a splice variant of the dysferlin gene. At the 5' end
of the brain isoform, 489 nucleotides are unique to
brain-specific dysferlin. The amino acid sequence
encoded by the brain dysferlin nucleic acid sequence (SEQ
ID NO:233; Figure 6) contains a unique sequence with an
initiation codon within a Kozak consensus sequence. The
nucleic acid sequence unique to brain-specific dysferlin
encodes a novel 24 amino acid sequence.

Identification of Mutations in Miyoshi Myopathy

Two strategies were used to determine whether this
6.9 kb cDNA (SEQ ID NO:1) is mutated in MM. First, the
genomic organization of the corresponding gene was
determined and the adjoining intronic sequence at each of
the 55 exons which make up the cDNA was identified. To
identify exon-intron boundaries within the gene, PAC DNA
was extracted with the standard Qiagen Mini Prep
protocol. Direct sequencing was performed with DNA
Sequence System (Promega, Madison, WI) using 32P
end-labeled primers (Benes et al., 1997, BioTechniques
23:98-100). Exon-intron boundaries were identified as the
sites where genomic and cDNA sequences diverged. Second,
in patients for whom muscle biopsies were available,
RT-PCR was also used to prepare cDNA for the candidate gene
from the muscle biopsy specimen.

Single strand conformational polymorphism analysis
(SSCP) was used to screen each exon in patients from 12
MM families. Putative mutations identified in this way
were confirmed by direct sequencing from genomic DNA
using exon-specific intronic primers. Approximately 20
ng of total genomic DNA from immortalized lymphocyte cell
lines were used as a template for PCR amplification
analysis of each exon using primers (below) located in
the adjacent introns. SSCP analysis was performed as
previously described (Aoki et al., 1998, Ann. Neurol.
43:645-53). In patients for whom muscle biopsies were
available, mRNA was isolated using RNA-STAT-60™
(Tel-Test, Friendswood, TX) and first-strand cDNA was
synthesized from 1-2 μg total RNA with MMLV reverse
transcriptase and random hexamer primers (Life
Technologies, Gaithersburg, MD). Three μl of this
product were used for PCR amplification. Eight sets of
primers were designed for muscle cDNA, and overlapping
cDNA fragments suitable for SSCP analysis were amplified.
After initial denaturation at 94°C for 2 min,
amplification was performed using 30 cycles at 94°C for 30
s, 56°C for 30 s, and 72°C for 60 s. The sequences of
polymorphisms detected by SSCP analysis were determined
by the dideoxy termination method using the Sequenase kit
(US Biochemicals). In some instances, the base pair
changes predicted corresponding changes in restriction
enzyme recognition sites. Such alterations in
restriction sites were verified by digesting the relevant
PCR products with the appropriate restriction enzymes.
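How a base change can be predicted to create or abolish a restriction site may be sketched as follows. This is an illustration only: the amplicon sequences in the usage note are hypothetical, the enzyme panel is limited to three well-known palindromic recognition sequences (so one strand suffices), and the function names are not taken from the specification.

```python
# Recognition sequences of three common enzymes (all palindromic).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "PstI": "CTGCAG"}


def count_sites(seq: str, site: str) -> int:
    """Count (possibly overlapping) occurrences of `site` in `seq`."""
    seq = seq.upper()
    return sum(
        1
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] == site
    )


def site_changes(wild_type: str, mutant: str, sites=SITES) -> dict:
    """Map enzyme name -> (sites in wild type, sites in mutant) for every
    enzyme whose site count differs between the two amplicons."""
    changes = {}
    for enzyme, site in sites.items():
        before = count_sites(wild_type, site)
        after = count_sites(mutant, site)
        if before != after:
            changes[enzyme] = (before, after)
    return changes
```

For example, `site_changes("ttgacGAATTCagtc", "ttgacGAATCCagtc")` reports that a hypothetical T-to-C change abolishes an EcoRI site, which is the kind of prediction that would then be checked by digesting the PCR product.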
Primer pairs used for SSCP screening and exon
sequencing are as follows:

(1) exon 3, F3261 5'-tctcttctcctagagggccatag-3' (SEQ
ID NO:101) and R326 5'-ctgttcctccccatcgtctcatgg-3' (SEQ
ID NO:102);

(2) exon 20, F3121 5'-gctcctcccgtgaccctctg-3' (SEQ
ID NO:103) and R3121 5'-gggtcccagccaggagcactg-3' (SEQ ID
NO:104);

(3) exon 36, F2102 5'-cccctctcaccatctcctgatgtg-3'
(SEQ ID NO:105) and R2111 5'-tggcttcaccttccctctacctcgg-3'
(SEQ ID NO:106);

(4) exon 49, F1081 5'-tcctttggtaggaaatctaggtgg-3'
(SEQ ID NO:107) and R1081 5'-ggaagctggacaggcaagagg-3'
(SEQ ID NO:108);

(5) exon 50, F1091 5'-atatactgtgttggaaatcttaatgag-3'
(SEQ ID NO:109) and R1091 5'-gctggcaccacagggaatcgg-3'
(SEQ ID NO:110);

(6) exon 51, F1101 5'-ctttgcttccttgcatccttctctg-3'
(SEQ ID NO:111) and R1101 5'-agcccccatgtgcagaatggg-3'
(SEQ ID NO:112);

(7) exon 52, F1111 5'-ggcagtgatcgagaaacccgg-3' (SEQ
ID NO:113) and R1111 5'-catgccctccactggggctgg-3' (SEQ ID
NO:114);

(8) exon 54, F1141 5'-ggatgcccagttgactccggg-3' (SEQ ID
NO:115) and R1141 5'-ccccaccacagtgtcgtcagg-3' (SEQ ID
NO:116);

(9) exon 29, F3031 5'-aagtgccaagcaatgagtgaccgg-3' (SEQ
ID NO:184) and R3021 5'-ctcactcccacccaccacctg-3' (SEQ ID
NO:185);

(10) exon 31, F2141 5'-gaatctgccataaccagcttcgtg-3' (SEQ
ID NO:188) and R2141 5'-tatcaccccatagaggcctcgaag-3' (SEQ ID
NO:189);

(11) exon 32, F2981 5'-cagccactcactctggcacctctg-3' (SEQ
ID NO:190) and R2981 5'-agcccacagtctctgactctcctg-3' (SEQ ID
NO:191);

(12) exon 43, F2031 5'-cagccaaaccatatcaacaatg-3' (SEQ
ID NO:210) and R2021 5'-ctggggaggtgagggctctag-3' (SEQ ID
NO:211);

(13) exon 44, F2011 5'-gaagtgttttgtctcctcctc-3' (SEQ ID
NO:212) and R2011 5'-gcaggcagccagcccccatc-3' (SEQ ID
NO:213);

(14) exon 46, F1041 5'-ctcgtctatgtcttgtgcttgctc-3' (SEQ
ID NO:216) and R1051 5'-caccatggtttggggtcatgtgg-3' (SEQ ID
NO:217).

These primers were used in SSCP screening and exon
sequencing, and identified eighteen different mutations
in fifteen families (Table 2).

Twelve of the eighteen different mutations are predicted
to block dysferlin expression, either through nonsense or
frameshift changes. Seven of the thirteen samples are
homozygous and thus expected to result in complete loss
of dysferlin function. For each mutated exon in these
patients, at least 50 control DNA samples (100
chromosomes) were screened to determine the frequencies
of the sequence variants. When possible, the parents and
siblings of affected individuals were also screened to
verify that defined mutations were appropriately
co-inherited with the disease in each pedigree (Fig. 4). In
two families (50, 58 in Table 2) heterozygous mutations
were identified in one allele (respectively a missense
mutation and a 2 bp deletion). Mutations in the other
allele in these two families, and mutations in three of
the screened MM families, are presumed to have gone
undetected either because the mutant and normal SSCP
products are indistinguishable or because the mutation
lies outside of coding sequence (i.e., in the promoter or
a regulatory region of an intron). The disease-associated
mutations did not appear to arise in the population as
common polymorphisms.

More mutations can be identified by using 
appropriate primer pairs to amplify an exon and analyze 
its sequence. The following primer pairs are useful for 
exon amplification. 

Exon  Code   Primer Sequence

1     F408   5'-gacccacaagcggcgcctcgg-3' {SEQ ID NO:130}
      F4101  5'-gaccccggcgagggtggtcgg-3' {SEQ ID NO:131}
2     F4111  5'-tgtctctccattctcccttttgtg-3' {SEQ ID NO:132}
      R4111  5'-aggacactgctgagaaggcacctc-3' {SEQ ID NO:133}
3     F3262  5'-agtgccctggtggcacgaagg-3' {SEQ ID NO:134}
      R3261  5'-cctacctgcaccttcaagccatgg-3' {SEQ ID NO:135}
4     F3251  5'-cagaagagccagggtgccttagg-3' {SEQ ID NO:136}
      R3251  5'-ccttggaccttaacctggcagagg-3' {SEQ ID NO:137}
5     F3242  5'-cgaggccagcgcaccaacctg-3' {SEQ ID NO:138}
      R3242  5'-actgccggccattcttgctggg-3' {SEQ ID NO:139}
6     F3231  5'-ccaggcctcattagggccctc-3' {SEQ ID NO:140}
      R3231  5'-ctgaagaggagcctggggtcag-3' {SEQ ID NO:141}
7     F3222  5'-ctgagatttctgactcttggggtg-3' {SEQ ID NO:142}
      R3211  5'-aaggttctgccctcatgccccatg-3' {SEQ ID NO:143}
8     F3561  5'-ctggcctgagggatcagcagg-3' {SEQ ID NO:144}
      R3561  5'-gtgcatacatacagcccacggag-3' {SEQ ID NO:145}
9     F3551  5'-gagctattgggttggccgtgtggg-3' {SEQ ID NO:146}
      R3552  5'-accaacacggagaagtgagaactg-3' {SEQ ID NO:147}
10    F3201  5'-ccacactttatttaacgctttggcgg-3' {SEQ ID NO:148}
      R3201  5'-cagaaccaaaatgcaaggatacgg-3' {SEQ ID NO:149}
11    F3191  5'-cttctgattctgggatcaccaaagg-3' {SEQ ID NO:150}
      F3191  5'-ggaccgtaaggaagacccaggg-3' {SEQ ID NO:151}
12    F3181  5'-cctgtgctcaggagcgcatgaagg-3' {SEQ ID NO:152}
      R3181  5'-gcagacctcccacccaagggcg-3' {SEQ ID NO:153}
13    F3171  5'-gagacagatgggggacagtcaggg-3' {SEQ ID NO:154}
      R3171  5'-cctcccgagagaaccctcctg-3' {SEQ ID NO:155}
14    F3161  5'-gggagcccagagtccccatgg-3' {SEQ ID NO:156}
      R3161  5'-gggcctccttgggtttgctgg-3' {SEQ ID NO:157}
15    F3541  5'-gcctccccagcatcctgccgg-3' {SEQ ID NO:158}
      R3541  5'-tcactgagccgaatgaaactgagg-3' {SEQ ID NO:159}
16    F3531  5'-tgtggcctgagttcctttcctgtg-3' {SEQ ID NO:160}
      R3531  5'-ggtcaaagggcagaacgaagaggg-3' {SEQ ID NO:161}
17    F3151  5'-cccgtccttctcccagccatg-3' {SEQ ID NO:162}
      R3151  5'-ctcccctggttgtccccaagg-3' {SEQ ID NO:163}
18    F3141  5'-cgacccctctgattgccacttgtg-3' {SEQ ID NO:164}
      R3141  5'-ggcatcctgcccttgccaggg-3' {SEQ ID NO:165}
19    F3522  5'-tctgtctcccctgctccttg-3' {SEQ ID NO:166}
      R3522  5'-cttccctgccccgacgcccag-3' {SEQ ID NO:167}
20    F3121  5'-gctcctcccgtgaccctctgg-3' {SEQ ID NO:103}
      R3121  5'-gggtcccagccaggagcactg-3' {SEQ ID NO:104}
21    F3111  5'-cagcgctcaggcccgtctctc-3' {SEQ ID NO:168}
      R3111  5'-tgcataggcatgtgcagctttggg-3' {SEQ ID NO:169}
22    F3512  5'-catgcaccctctgccctgtgg-3' {SEQ ID NO:170}
      R3512  5'-agttgagccaggagaggtggg-3' {SEQ ID NO:171}
23    F3101  5'-catcaggcgcattccatctgtccg-3' {SEQ ID NO:172}
      R3091  5'-agcaggagagcagaagaagaaagg-3' {SEQ ID NO:173}
24    F3082  5'-gtgtgtcaccatccccaccccg-3' {SEQ ID NO:174}
      R3082  5'-caagagatgggagaaaggccttatg-3' {SEQ ID NO:175}
25    F3073  5'-ctgggacatccggatcctgaagg-3' {SEQ ID NO:176}
      R3073  5'-tccaggtagtgggaggcagagg-3' {SEQ ID NO:177}
26    F3061  5'-tcccactacctggagctgccttgg-3' {SEQ ID NO:178}
      R3051  5'-ggctctccccagccctccctg-3' {SEQ ID NO:179}
27    F3601  5'-cagagcagcagagactctgaccag-3' {SEQ ID NO:180}
      R3601  5'-tagaccccacctgcccctgag-3' {SEQ ID NO:181}
28    F3501  5'-tcctctcattgcttgcctgttcgg-3' {SEQ ID NO:182}
      R3501  5'-ttgagagcttgccggggatgg-3' {SEQ ID NO:183}
29    F3031  5'-aagtgccaagcaatgagtgaccgg-3' {SEQ ID NO:184}
      R3021  5'-ctcactcccacccaccacctg-3' {SEQ ID NO:185}
30    F3011  5'-cccaccggcctctgagtctgc-3' {SEQ ID NO:186}
      R3001  5'-accctacccaagccaggacaagtg-3' {SEQ ID NO:187}
31    F2141  5'-gaatctgccataaccagcttcgtg-3' {SEQ ID NO:188}
      R2141  5'-tatcaccccatagaggcctcgaag-3' {SEQ ID NO:189}
32    F2981  5'-cagccactcactctggcacctctg-3' {SEQ ID NO:190}
      R2981  5'-agcccacagtctctgactctcctg-3' {SEQ ID NO:191}
33    F2131  5'-acatctctcagggtccctgctgtg-3' {SEQ ID NO:192}
      R2211  5'-cctgtgaggggacgaggcagg-3' {SEQ ID NO:193}
34    F2202  5'-gccctgggtaagggatgctgattc-3' {SEQ ID NO:194}
      R2202  5'-cctgcctgggcctcctggatc-3' {SEQ ID NO:195}
35    F2111  5'-gagggtgatgggggccttagg-3' {SEQ ID NO:196}
      R2112  5'-gcaatcagtttgaagaaggaaagg-3' {SEQ ID NO:197}
36    F2102  5'-cccctctcaccatctcctgatgtg-3' {SEQ ID NO:105}
      R2111  5'-ggcttcaccttccctctacctcgg-3' {SEQ ID NO:106}
37    F2101  5'-cacctttgtctccattctacctgc-3' {SEQ ID NO:198}
      R2101  5'-ctcccagcccccacgcccagg-3' {SEQ ID NO:199}
38    F2091  5'-ctgagccactctcctcattctgtg-3' {SEQ ID NO:200}
      R2091  5'-tggaaggggacagtagggagg-3' {SEQ ID NO:201}
39    F2081  5'-ggccagtgcgttcttcctcctc-3' {SEQ ID NO:202}
      R2071  5'-tccctgacctgcccatcatctc-3' {SEQ ID NO:203}
40    F2061  5'-gcccctgtcaggcctggatgg-3' {SEQ ID NO:204}
      R2061  5'-tgacccaggcctccctggagg-3' {SEQ ID NO:205}
41    F2051  5'-ctgaaatggtctctttctttctac-3' {SEQ ID NO:206}
      R2051  5'-cacaccgactgtcagactgaagag-3' {SEQ ID NO:207}
42    F2041  5'-ttgtcccctcctctaatccccatg-3' {SEQ ID NO:208}
      R2041  5'-gggttagggacgtcttcgagg-3' {SEQ ID NO:209}
43    F2031  5'-cagccaaaccatatcaacaatg-3' {SEQ ID NO:210}
      R2021  5'-ctggggaggtgagggctctag-3' {SEQ ID NO:211}
44    F2011  5'-gaagtgttttgtctcctcctc-3' {SEQ ID NO:212}
      R2011  5'-gcaggcagccagcccccatc-3' {SEQ ID NO:213}
45    F1021  5'-gggtgccctgtgttggctgac-3' {SEQ ID NO:214}
      R1031  5'-gcaggcagccagcccccatc-3' {SEQ ID NO:215}
46    F1041  5'-ctcgtctatgtcttgtgcttgctc-3' {SEQ ID NO:216}
      R1051  5'-caccatggtttggggtcatgtgg-3' {SEQ ID NO:217}
47    F1061  5'-tctcgcttccccagctcctgc-3' {SEQ ID NO:218}
      R1061  5'-tctggagttcgaggactctggg-3' {SEQ ID NO:219}
48    F1071  5'-agaagggtggggagagaacgg-3' {SEQ ID NO:220}
      R1071  5'-cagctcagagcctgtggctgg-3' {SEQ ID NO:221}
49    F1082  5'-aaggccttcccatcctttggtagg-3' {SEQ ID NO:222}
      R1082  5'-acaacccagagggagcacggg-3' {SEQ ID NO:223}
50    F1092  5'-gttgacgatgtatatactgtgttgg-3' {SEQ ID NO:224}
      R1091  5'-gctggcaccacagggaatcgg-3' {SEQ ID NO:110}
51    F1102  5'-gcctctctctaactttgcttccttg-3' {SEQ ID NO:225}
      R1101  5'-agcccccatgtgcagaatggg-3' {SEQ ID NO:112}
52    F1112  5'-ggctacaggctggcagtgatcgag-3' {SEQ ID NO:226}
      R1112  5'-ttcccccatgccctccactgg-3' {SEQ ID NO:227}
53    F1121  5'-agccttcgtgcccctaaccaagtg-3' {SEQ ID NO:228}
      R1121  5'-ctgtgggcattggggctcagg-3' {SEQ ID NO:229}
54    F1141  5'-ggatgcccagttgactccggg-3' {SEQ ID NO:115}
      R1141  5'-ccccaccacagtgtcgtcagg-3' {SEQ ID NO:116}
55    F1151  5'-gccccagtgggatcaccatg-3' {SEQ ID NO:230}
      R116   5'-atgctggaggggaccccacgg-3' {SEQ ID NO:231}
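Basic properties of the primers tabulated above (length, GC fraction, and a rough melting temperature) can be computed with a small sketch. The melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is an assumption made here for illustration: it is a coarse approximation best suited to short oligonucleotides and is not stated in the specification as the basis for the 56°C annealing step. The function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule,
    2 * (A + T) + 4 * (G + C); an approximation only."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Exon 54 forward primer F1141 (SEQ ID NO:115) from the table.
f1141 = "ggatgcccagttgactccggg"
print(len(f1141), round(gc_fraction(f1141), 2), wallace_tm(f1141))
```

A nearest-neighbor model would give more realistic values for primers of this length; the Wallace rule is used only because it can be checked by hand.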

Comparison of Dysferlin With Other Proteins

The 6,243 bp ORF of this candidate MM gene is
predicted to encode 2,080 amino acids (Figs. 1C and 2;
SEQ ID NO:2). At the amino acid level, this protein is
highly homologous to the nematode (Caenorhabditis
elegans) protein fer-1 (27% identical, 57% identical or
similar: the sequence alignment and comparison was
performed using
http://vega.igh.cnrs.fr/bin/nph-align_query.pl)
(Argon & Ward, 1980, Genetics 96:413-33;
Achanzar & Ward, 1997, J. Cell Science 110:1073-81).
This dystrophy-associated, fer-1-like protein has
therefore been designated "dysferlin."
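The relationship between the 6,243 bp ORF and the 2,080 predicted amino acids is simple codon arithmetic, sketched below under the assumption that the stated ORF length includes the terminal stop codon. The function name is illustrative.

```python
def encoded_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs.

    Assumes the length is a whole number of codons; when `includes_stop`
    is True the final codon is a stop codon and encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


print(encoded_residues(6243))
```

With these assumptions, 6,243 bp corresponds to 2,081 codons, one of which is the stop codon, leaving 2,080 residues as stated in the text.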

The fer-1 protein was originally identified through
molecular genetic analysis of a class of
fertilization-defective C. elegans mutants in which
spermatogenesis is abnormal (Argon & Ward, 1980, Genetics
96:413-33). The mutant fer-1 spermatozoa have defective
mobility and show imperfect fusion of membranous
organelles (Ward et al., 1981, J. Cell Bio. 91:26-44).
Like fer-1, dysferlin is a large protein with an
extensive, highly charged hydrophilic region and a single
predicted membrane spanning region at the carboxy
terminus (Fig. 3). There is a membrane retention sequence
3' to the membrane spanning stretch, indicating that the
protein may be preferentially targeted to either
endoplasmic or sarcoplasmic reticulum, probably as a Type
II protein (i.e., with the NH2 end and most of the
following protein located within the cytoplasm) (Fig.
1C). Several nuclear membrane targeting sequences are
predicted within the cytoplasmic domain of the protein
(http://psort.nibb.ac.jp/form.html). Immunocytochemical
detection of dysferlin suggests that dysferlin is
targeted to or anchored within the sarcoplasmic
reticulum.

The cytoplasmic component of this protein contains
four motifs homologous to C2 domains. C2 domains are
intracellular protein modules composed of 80-130 amino
acids (Rizo & Sudhof, 1998, J. Biol. Chem. 273:15897).
Originally identified within a calcium-dependent isoform
of protein kinase C (Nishizuka, 1988, Nature 334:661-65),
C2 domains are present in numerous proteins. These
domains often arise in approximately homologous pairs
described as double C2 or DOC2 domains. One DOC2
protein, DOC2α, is brain specific and highly concentrated
in synaptic vesicles (Orita et al., 1995, Biochem.
Biophys. Res. Comm. 206:439-48), while another, DOC2β, is
ubiquitously expressed (Sakaguchi et al., 1995, Biochem.
Biophys. Res. Comm. 217:1053-61). Many C2 modules can
fold to bind calcium, thereby initiating signaling events
such as phospholipid binding. At distal nerve
terminals, for example, the synaptic vesicle protein
synaptotagmin has two C2 domains that, upon binding
calcium, permit this protein to interact with syntaxin,
triggering vesicle fusion with the distal membrane and
neurotransmitter release (Sudhof & Rizo, 1996, Neuron
17:379-88).

The four dysferlin C2 domains are located at amino
acid positions 32-82, 431-475, 1160-1241, and 1582-1660
(Figs. 1C and 3). Indeed, it is almost exclusively
through these regions that dysferlin has homology to any
proteins other than fer-1. Each of these segments in
dysferlin is considerably smaller than a typical C2
domain. Moreover, these segments are more widely
separated in comparison with the paired C2 regions in
synaptotagmin, DOC2α and β and related C2-positive
proteins. For this reason, it is difficult to predict
whether the four relatively short C2 domains in dysferlin
function analogously to conventional C2 modules. That
dysferlin might, by analogy with synaptotagmin, signal
events such as membrane fusion is suggested by the fact
that fer-1 deficient worms show defective membrane
organelle fusion within spermatozoa (Ward et al., 1981,
J. Cell Bio. 91:26-44).

The invention will be further described in the
following examples, which do not limit the scope of the
invention described in the claims.

EXAMPLES 

Example 1: Production of dysferlin protein 

Standard methods can be used to synthesize either
wild type or mutant dysferlin, or fragments of either.
These methods can also be used to synthesize
brain-specific dysferlin polypeptides including full-length or
fragments (e.g., a polypeptide unique to brain-specific
dysferlin). For example, a recombinant expression vector
encoding dysferlin (or a fragment thereof: e.g.,
dysferlin minus its membrane-spanning region) operably
linked to appropriate expression control sequences can be
used to express dysferlin in a prokaryotic (e.g., E. coli)
or eukaryotic host (e.g., insect cells, yeast cells, or
mammalian cells). The protein is then purified by
standard techniques. If desired, DNA encoding part or
all of the dysferlin sequence can be joined in-frame to
DNA encoding a different polypeptide, to produce a
chimeric DNA that encodes a hybrid polypeptide. This can
be used, for example, to add a tag that will simplify
identification or purification of the expressed protein,
or to render the dysferlin (or fragment thereof) more
immunogenic.

The preferred means for making short peptide
fragments of dysferlin is by chemical synthesis. These
fragments, like dysferlin itself, can be used to generate
antibodies, or as positive controls for antibody-based
assays.

Fusion proteins are useful, e.g., for generating
antibodies. Such fusion proteins are generated using
known methods. In one example, to construct glutathione
S-transferase (GST):dysferlin fusion proteins, the BLAST
program (Altschul et al., 1990, J. Molec. Biol.
215:403-410) was used to identify three regions of the
dysferlin cDNA that show no homology to any known human
proteins (Figure 1). These were subcloned from the
dysferlin cDNA as BstYI (881-1333), XmnI (1990-2718) and
SalI (5364-5732) fragments ligated respectively into
BamHI, SmaI and SalI sites of pGEX-5X-3 (Pharmacia). The
three fragments correspond to amino acid sequences at
amino acid locations 253-403, 624-865, and 1664-1786 of
SEQ ID NO:2, respectively. The resulting GST fusion
proteins of BamHI (43 kDa) and SmaI (53.3 kDa) formed
insoluble aggregates that were isolated by SDS-PAGE. The
fusion protein of SalI (40.2 kDa) was soluble and thus
could be purified using a glutathione Sepharose 4B
column; the SalI dysferlin fragment (14.2 kDa) was
isolated by cleavage from GST using Factor Xa protease.
The eluted protein was concentrated and further purified
by SDS-PAGE. For all three of the fusion peptides, the
resulting SDS-PAGE bands were excised and used to
immunize rabbits.
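As a rough plausibility check on the fusion protein sizes quoted above, the expected masses can be estimated from the amino acid ranges. The sketch rests on two assumptions that are not taken from the specification: an average residue mass of about 110 Da and a GST moiety of about 26 kDa. It ignores linker residues and actual amino acid composition, so agreement to within about 1 kDa is all that should be expected.

```python
AVG_RESIDUE_KDA = 0.110  # assumed average mass per amino acid residue
GST_TAG_KDA = 26.0       # assumed approximate mass of the GST moiety


def residues_in_range(first: int, last: int) -> int:
    """Number of residues in an inclusive amino acid range."""
    return last - first + 1


def estimated_fusion_mass_kda(first: int, last: int) -> float:
    """Estimated mass (kDa) of GST fused to residues first..last."""
    return GST_TAG_KDA + residues_in_range(first, last) * AVG_RESIDUE_KDA


# Amino acid ranges and reported fusion masses (kDa) from the text.
reported = {(253, 403): 43.0, (624, 865): 53.3, (1664, 1786): 40.2}
for (first, last), mass in reported.items():
    estimate = estimated_fusion_mass_kda(first, last)
    print(first, last, round(estimate, 1), mass)
```

Under these assumptions the three estimates fall within about 1 kDa of the reported 43, 53.3 and 40.2 kDa values.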

Example 2: Production and characterization of
anti-dysferlin antibodies

Techniques for generating both monoclonal and
polyclonal antibodies specific for a particular protein
are well known. The antibodies can be raised against a
short peptide epitope of dysferlin, an epitope linked to
a known immunogen to enhance immunogenicity, a long
fragment of dysferlin, or the intact protein. Antibodies
can also be raised against brain-specific dysferlin
polypeptides, e.g., against amino acids 1-24 of SEQ ID
NO:233. Such antibodies raised against dysferlin or
brain-specific dysferlin polypeptides are useful for,
e.g., localizing such polypeptides in tissue sections or
fractionated cell preparations and diagnosing
dysferlin-related disorders.

An isolated dysferlin protein, or a portion or
fragment thereof, can be used as an immunogen to generate
antibodies that bind dysferlin using standard techniques
for polyclonal and monoclonal antibody preparation. The
dysferlin immunogen can also be a mutant dysferlin or a
fragment of a mutant dysferlin. A full-length dysferlin
protein can be used or, alternatively, antigenic peptide
fragments of dysferlin can be used as immunogens. The
antigenic peptide of dysferlin comprises at least 8
(preferably 10, 15, 20, or 30) amino acid residues of the
amino acid sequence shown in SEQ ID NO:2 and encompasses
an epitope of dysferlin such that an antibody raised
against the peptide forms a specific immune complex with
dysferlin. Preferred epitopes encompassed by the
antigenic peptide are regions of dysferlin that are
located on the surface of the protein, e.g., hydrophilic
regions.

A dysferlin immunogen typically is used to prepare
antibodies by immunizing a suitable subject (e.g.,
rabbit, goat, mouse or other mammal) with the immunogen.
An appropriate immunogenic preparation can contain, for
example, recombinantly expressed dysferlin protein or a
chemically synthesized dysferlin polypeptide. The
preparation can further include an adjuvant, such as
Freund's complete or incomplete adjuvant, or similar
immunostimulatory agent. Immunization of a suitable
subject with an immunogenic dysferlin preparation induces
a polyclonal anti-dysferlin antibody response.

Polyclonal anti-dysferlin antibodies ("dysferlin
antibodies") can be prepared as described above by
immunizing a suitable subject with a dysferlin immunogen.
The dysferlin antibody titer in the immunized subject can
be monitored over time by standard techniques, such as
with an enzyme-linked immunosorbent assay (ELISA) using
immobilized dysferlin. If desired, the antibody
molecules directed against dysferlin can be isolated from
the mammal (e.g., from the blood) and further purified by
well-known techniques, such as protein A chromatography
to obtain the IgG fraction. At an appropriate time after
immunization, e.g., when the dysferlin antibody titers
are highest, antibody-producing cells can be obtained
from the subject and used to prepare monoclonal
antibodies by standard techniques, such as the hybridoma
technique originally described by Kohler and Milstein
(1975) Nature 256:495-497, the human B cell hybridoma
technique (Kozbor et al. (1983) Immunol. Today 4:72), the
EBV-hybridoma technique (Cole et al. (1985), Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
77-96) or trioma techniques. The technology for
producing hybridomas is well known (see generally Current
Protocols in Immunology (1994) Coligan et al. (eds.) John
Wiley & Sons, Inc., New York, NY). Briefly, an immortal
cell line (typically a myeloma) is fused to lymphocytes
(typically splenocytes) from a mammal immunized with a
dysferlin immunogen as described above, and the culture
supernatants of the resulting hybridoma cells are
screened to identify a hybridoma producing a monoclonal
antibody that binds dysferlin.
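
The titer monitoring described above can be sketched as follows. This is
a hedged illustration only: it assumes titer is expressed as an endpoint
titer (the reciprocal of the highest serum dilution whose ELISA optical
density is at or above a chosen cutoff), and the bleed days, dilutions,
OD readings, and cutoff are all hypothetical values.

```python
def endpoint_titer(od_by_dilution, cutoff):
    """Reciprocal of the highest dilution with OD >= cutoff (0 if none)."""
    positive = [dil for dil, od in od_by_dilution.items() if od >= cutoff]
    return max(positive, default=0)


def peak_titer_day(bleeds, cutoff):
    """Return (day with the highest endpoint titer, titers for every day)."""
    titers = {day: endpoint_titer(ods, cutoff) for day, ods in bleeds.items()}
    return max(titers, key=titers.get), titers


if __name__ == "__main__":
    # Hypothetical readings: {day: {reciprocal dilution: OD}}.
    bleeds = {
        21: {100: 1.8, 400: 0.9, 1600: 0.20, 6400: 0.05},
        42: {100: 2.1, 400: 1.6, 1600: 0.80, 6400: 0.15},
        63: {100: 2.0, 400: 1.2, 1600: 0.25, 6400: 0.10},
    }
    day, titers = peak_titer_day(bleeds, cutoff=0.3)
    print(day, titers)
```

In this toy data set the titer peaks at the second bleed, which under the
stated assumption would be the point at which antibody-producing cells
are harvested.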

Any of the many well-known protocols used for fusing
lymphocytes and immortalized cell lines can be applied
for the purpose of generating a monoclonal antibody
against dysferlin (see, e.g., Current Protocols in
Immunology, supra; Galfre et al. (1977) Nature 266:550-52;
R.H. Kenneth, in Monoclonal Antibodies: A New Dimension
In Biological Analyses, Plenum Publishing Corp., New
York, New York (1980); and Lerner (1981) Yale J. Biol.
Med., 54:387-402). Moreover, one skilled in the art will
appreciate that there are many variations of such methods
which also would be useful. Hybridoma cells producing a
monoclonal antibody of the invention are detected by
screening the hybridoma culture supernatants for
antibodies that bind dysferlin, e.g., using a standard
ELISA.

As an alternative to preparing monoclonal
antibody-secreting hybridomas, a monoclonal dysferlin
antibody can be identified and isolated by screening a
recombinant combinatorial immunoglobulin library (e.g., an
antibody phage display library) with dysferlin to thereby
isolate immunoglobulin library members that bind
dysferlin. Kits for generating and screening phage display
libraries are commercially available (e.g., the Pharmacia
Recombinant Phage Antibody System, Catalog No. 27-9400-01;
and the Stratagene SurfZAP™ Phage Display Kit, Catalog No.
240612). Additionally, examples of methods and reagents
particularly amenable for use in generating and screening
antibody display libraries can be found in, for example,
U.S. Patent No. 5,223,409; PCT Publication No. WO
92/18619; PCT Publication No. WO 91/17271; PCT
Publication No. WO 92/20791; PCT Publication No. WO
92/15679; PCT Publication No. WO 93/01288; PCT
Publication No. WO 92/01047; PCT Publication No. WO
92/09690; PCT Publication No. WO 90/02809; Fuchs et al.
(1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum.
Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science
246:1275-1281; and Griffiths et al. (1993) EMBO J.
12:725-734.

As an example, two polyclonal antisera were raised
for each of the fusion peptide antigens described above
using New Zealand White rabbits. The rabbits were
injected with 0.5 mg of antigen using keyhole limpet
hemocyanin (KLH) as the adjuvant. Booster injections of
0.25 mg antigen were administered every three weeks over
12 weeks. Serum was prepared from the rabbits and was
purified using affinity column chromatography (HiTrap;
Pharmacia) or antigen-blotted polyvinylidene difluoride
(PVDF) membrane.

Western immunoblotting (WIB) was used to verify that the
affinity-purified antisera recognize the cognate fusion
peptides and that this reactivity was immunoadsorbed by
pre-incubation of the antisera with the peptides. Thus,
antiserum raised against the polypeptide encoded by the
SalI fragment (encoding amino acids 1664-1786) identified
the fragment both as a cleaved, 14.2 kDa fragment and as a
component of the 40.2 kDa GST-SalI fusion peptide. No
reactivity was evident in the fraction containing only the
GST fusion partner. Immunoadsorption entirely abolished
this staining. Analogous results were detected with all
six antisera (to the three different target fusion
peptides).
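
The band sizes reported above can be sanity-checked with simple
arithmetic. The sketch below is illustrative only and rests on two
assumptions that are not part of this disclosure: a rule-of-thumb
average residue mass of about 0.110 kDa, and a GST partner of roughly
26 kDa. The observed 14.2 kDa and 40.2 kDa values are taken from the
text; the cleaved fragment may carry extra linker residues, so an exact
match with the estimate is not expected.

```python
AVG_RESIDUE_KDA = 0.110  # assumed rule-of-thumb average mass per residue
GST_KDA = 26.0           # assumed approximate mass of the GST partner


def estimated_mass_kda(first_residue, last_residue):
    """Return (number of residues, estimated mass in kDa) for a fragment."""
    n_residues = last_residue - first_residue + 1
    return n_residues, round(n_residues * AVG_RESIDUE_KDA, 1)


# SalI fragment: amino acids 1664-1786 of dysferlin.
n_residues, fragment_kda = estimated_mass_kda(1664, 1786)
estimated_fusion_kda = round(fragment_kda + GST_KDA, 1)

# Difference between the two observed bands reported in the text.
implied_partner_kda = round(40.2 - 14.2, 1)

if __name__ == "__main__":
    print(n_residues, fragment_kda, estimated_fusion_kda, implied_partner_kda)
```

Under these assumptions the estimated fragment and fusion masses fall
close to the observed bands, and the difference between the two observed
bands is consistent with a GST-sized fusion partner.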

Preparation of subcellular fractions 

Frozen human muscle (0.3 g) was homogenized in five
volumes of 0.25 M sucrose containing proteinase inhibitor
(Complete, Boehringer). Subcellular fractions of nuclei,
mitochondria, microsomes, and cytosol were separated by
differential centrifugation. The purity of each fraction
was evaluated by immunoblotting of fraction-specific
proteins with antibodies to histone H1 (Calbiochem),
cytochrome c (Santa Cruz), Na+-K+ ATPase α1 subunit
(Research Diagnostics) and cytosolic superoxide dismutase
(Calbiochem).



Dysferlin in subcellular fractions
Immunoblotting was used to analyze dysferlin
expression. Twenty µg of each subcellular fraction and
40 µg of whole homogenate of muscle were separated by
SDS-PAGE (4-15% gradient gel) and transferred to a
nitrocellulose membrane. Immunoblotting was performed
according to standard methods, using chemiluminescence
(ECL, Amersham). Immunoblotting of multi-tissue blots
identified prominent dysferlin positivity at
approximately 230 kDa in heart, placenta, skeletal muscle
and kidney. Little or no immuno-positive staining was
detected in brain, liver, spleen, ovary, or testis.
Lower molecular weight bands (approximately 40 kDa) were
also evident. Immunoadsorption with the corresponding
fusion peptide abolished both the large and the smaller
bands. The 230 kDa band was observed with all of the
affinity-purified anti-dysferlin antisera.

Immunoblotting of fractionated human muscle
documented distinct 230 kDa bands in the whole muscle
homogenate and in microsomal and nuclear fractions. Some
immunoreactivity was also evident in the nuclear and
mitochondrial fractions. No immunoreactivity was
detected in the cytosolic fractions. This pattern was
seen with all of the anti-dysferlin antisera, and was
eliminated by immunoadsorption. The identity of the
assayed fractions was verified by Western blotting using
fraction-specific antibodies: histone H1 for the nuclear
fraction, cytochrome c for the mitochondrial fraction,
Na+-K+ ATPase α1-subunit for the microsomal fraction, and
SOD1 for the cytosolic fraction.



Example 3: Diagnosis

The discovery of mutations in the dysferlin gene
that are associated with the MM and LGMD2B phenotypes
means that individuals can be tested for the disease gene
before symptoms appear. This will permit genetic testing
and counseling of those with a family history of the
disease. Additionally, individuals diagnosed with the
genetic defect can be closely monitored for the
appearance of symptoms, thereby permitting early
intervention, including genetic therapy, as appropriate.
Individuals with a brain-specific dysferlin-related
disorder can be diagnosed using such methods.

Diagnosis can be carried out on any suitable genomic
DNA sample from the individual to be tested. Typically,
a blood sample from an adult or child, or a sample of
placental or umbilical cord cells of a newborn would be
used; alternatively, one could utilize a fetal sample
obtained by amniocentesis or chorionic villi sampling.

It is expected that standard genetic diagnostic
methods can be used. For example, PCR can be utilized to
identify the presence of a deletion, addition, or
substitution of one or more nucleotides within any one of
the exons of dysferlin. Following the PCR reaction, the
PCR product can be analyzed by methods such as a
heteroduplex detection technique based upon that of White
et al. (1992, Genomics 12:301-06), or by techniques such
as cleavage of RNA-DNA hybrids using RNase A (Myers et
al., 1985, Science 230:1242-46), single-stranded
conformation polymorphism (SSCP) analysis (Orita et al.,
1989, Genomics 10:298-99), di-deoxy-fingerprinting (DDF)
(Blaszyk et al., 1995, Biotechniques 18:256-260) and
denaturing gradient gel electrophoresis (DGGE; Myers et
al., 1987, Methods Enzymol. 155:501-27). The PCR may be
carried out using a primer which adds a G+C rich sequence
(termed a "GC-clamp") to one end of the PCR product, thus
improving the sensitivity of the subsequent DGGE
procedure (Sheffield et al., 1989, Proc. Natl. Acad. Sci.
USA 86:232-36). If the particular mutation present in
the patient's family is known to have removed or added a
restriction site, or to have significantly increased or
decreased the length of a particular restriction
fragment, a protocol based upon restriction fragment
length polymorphism (RFLP) analysis (perhaps combined
with PCR) may be appropriate.
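
The logic of the PCR-RFLP approach mentioned above can be sketched as
follows. This is a minimal illustration under stated assumptions, not a
protocol from this disclosure: the amplicon sequences are invented toy
strings, and EcoRI (recognition site GAATTC, cutting after the first G)
is used only as a familiar example enzyme. A mutation that removes the
site leaves the amplicon uncut, which changes the fragment pattern.

```python
def digest(seq, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every `site`.

    `cut_offset` is the number of bases into the site at which the
    enzyme cuts (1 for EcoRI, which cuts G^AATTC on the top strand).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 30-bp amplicons; the mutant carries A>G inside the site.
    wild_type = "A" * 10 + "GAATTC" + "C" * 14
    mutant = wild_type.replace("GAATTC", "GAGTTC")
    print("wild type:", digest(wild_type, "GAATTC", 1))
    print("mutant:   ", digest(mutant, "GAATTC", 1))
```

A heterozygous sample would be expected to show both patterns
superimposed, since it carries one cut and one uncut allele.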

The apparent genetic heterogeneity resulting in the
MM/LGMD2B phenotypes means that the nature of the
particular mutation carried by affected individuals in
the patient's family may have to be ascertained prior to
attempting genetic diagnosis of the patient.
Alternatively, a battery of tests designed to identify
any of several mutations known to result in MM/LGMD2B may
be utilized to screen individuals without a defined
familial genotype. The analysis can be carried out on
any genomic DNA derived from the patient, typically from
a blood sample.

Instead of basing the diagnosis on analysis of the
genomic DNA of a patient, one could seek evidence of the
mutation in the level or nature of the relevant
expression products. Well-known techniques for analyzing
expression include mRNA-based methods, such as Northern
blots and in situ hybridization (using a nucleic acid
probe derived from the relevant cDNA), and quantitative
PCR (as described in St-Jacques et al., 1994,
Endocrinology 134:2645-57). One could also employ
polypeptide-based methods, including the use of
antibodies specific for the polypeptide of interest.
These techniques permit quantitation of the amount of
expression of a given gene in the tissue of interest, at
least relative to positive and negative controls. One
would expect an individual who is heterozygous for a
genetic defect affecting the level of expression of
dysferlin to show up to a 50% loss of expression of this
gene in such a hybridization or antibody-based assay. An
antibody specific for the carboxy terminal end would be
likely to pick up (by failure to bind to) most or all
frameshift and premature termination signal mutations, as
well as deletions of the carboxy terminal sequence. Use
of a battery of monoclonal antibodies specific for
different epitopes of dysferlin would be useful for
rapidly screening cells to detect those expressing mutant
forms of dysferlin (i.e., cells which bind to some
dysferlin-specific monoclonal antibodies, but not to
others), or for quantifying the level of dysferlin on the
surface of cells. One could also use a protein
truncation assay (Heim et al., 1994, Nature Genetics
8:218-19) to screen for any genetic defect which results
in the production of a truncated polypeptide instead of
the wild type protein.
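
The antibody-battery idea described above can be illustrated with a
short sketch. It is a simplification under stated assumptions: each
antibody is taken to bind only if its entire epitope region is present,
a truncated product is assumed to be stable enough to detect, and the
three epitope regions are borrowed from the dysferlin fusion peptides
named elsewhere in this document (amino acids 253-403, 624-865, and
1664-1786 of SEQ ID NO: 2, a protein of 2080 residues). The antibody
labels and the example truncation point are hypothetical.

```python
FULL_LENGTH = 2080  # residues in SEQ ID NO: 2, per the claims

# Hypothetical antibody labels mapped to (first, last) epitope residues.
EPITOPES = {
    "aa253-403": (253, 403),
    "aa624-865": (624, 865),
    "aa1664-1786": (1664, 1786),
}


def binding_pattern(protein_length, epitopes=EPITOPES):
    """Predict which antibodies bind a protein truncated at `protein_length`.

    An antibody is predicted to bind only if its whole epitope region
    lies within the retained N-terminal portion of the protein.
    """
    return {name: last <= protein_length
            for name, (first, last) in epitopes.items()}


if __name__ == "__main__":
    print("wild type:", binding_pattern(FULL_LENGTH))
    # Hypothetical premature stop leaving residues 1-1000.
    print("truncated:", binding_pattern(1000))
```

A mixed pattern (some antibodies binding, others not) is the signature
the text describes for cells expressing a mutant form, and the most
C-terminal antibody is the one that fails for any truncation upstream of
its epitope.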

Use of immunodetection to identify normal and 
disease-associated dysferlin 

In the following example, immunodetection methods
are used to demonstrate a detectable difference in
muscle homogenates between normal and disease-associated
dysferlin alleles.

Frozen muscle samples (quadriceps) were homogenized
in ten volumes of SDS-PAGE sample buffer and boiled for 5
minutes. The final loading volume of SDS-PAGE was
adjusted after densitometric measurements (NIH Image) of
myosin heavy chain on the Coomassie blue stained gels.
Studies were performed on six MM, two LGMD-2B, and three
normal muscle samples.
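
The loading adjustment described above can be sketched as a simple
proportional correction. This is an assumption about how such a
normalization might be computed, not the procedure actually used: each
sample's volume is scaled so that its myosin heavy chain (MHC) signal
would match that of the sample with the weakest signal, which keeps every
adjusted volume at or below the trial volume. Sample names and
densitometry values are hypothetical.

```python
def adjusted_volumes(mhc_intensity, trial_volume_ul):
    """Scale loading volumes so every lane carries equal MHC signal.

    `mhc_intensity` maps sample name to the densitometric MHC signal
    measured when `trial_volume_ul` of that sample was loaded. Signal is
    assumed to be proportional to the volume loaded.
    """
    target = min(mhc_intensity.values())
    return {name: trial_volume_ul * target / signal
            for name, signal in mhc_intensity.items()}


if __name__ == "__main__":
    # Hypothetical densitometry readings from a Coomassie-stained trial gel.
    readings = {"normal_1": 1000, "MM_1": 800, "MM_2": 500}
    print(adjusted_volumes(readings, trial_volume_ul=10))
```

Equalizing lanes on an abundant muscle protein in this way is what allows
the absence of a dysferlin band to be attributed to the sample itself
rather than to under-loading.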

Immunocytochemistry was performed on 8 micron
cryostat sections of the muscle that were fixed in 100%
cold acetone for 5 minutes and preincubated with PBS
containing 1% BSA, 5% heat-inactivated goat serum and
0.2% Triton® X-100. The sections were incubated with
primary antibodies overnight at 4°C and with
fluorescein-labeled secondary antibodies (TAGO
Immunologicals) for 30 minutes at room temperature. The
primary antibodies were applied in two double staining
combinations: SalI-1 anti-dysferlin and anti-dystrophin
antibodies, and SalI-2 anti-dysferlin and
anti-δ-sarcoglycan antibodies. The sections were
mounted in SlowFade (Molecular Probes).

The 230 kDa antigen was absent in samples from all
five MM patients in immunoblot assays. All five patients
had normal patterns of dystrophin expression. Genetic
analysis of the dysferlin gene in the patients predicted
that at least two of the five MM patients should have no
full-length protein. Two of the other three patients had
mutations in at least one allele that are predicted to
eliminate normal dysferlin expression. In all five
patients, absence of dysferlin immunostaining was
documented with at least two other anti-dysferlin
antisera.

Immunostaining of dysferlin, dystrophin and
δ-sarcoglycan proteins demonstrated distinct
membrane-associated positivity for each protein in normal
muscle. By contrast, in both MM and LGMD-2B muscle the
dysferlin protein was absent, while the dystrophin and
δ-sarcoglycan proteins appeared normal.

Therapeutic Treatment

A patient with MM/LGMD2B, or an individual
genetically susceptible to contracting one or both of
these diseases, can be treated by supplying dysferlin
therapeutic agents of the present invention. Dysferlin
therapeutic agents include a DNA or a subgenomic
polynucleotide coding for a functional dysferlin protein.
A DNA (e.g., a cDNA) is prepared which encodes the wild
type form of the gene operably linked to expression
control elements (e.g., promoter and enhancer) that
induce expression in skeletal muscle cells or any other
affected cells. The DNA may be incorporated into a
vector appropriate for transforming the cells, such as a
retrovirus, adenovirus, or adeno-associated virus. One
of the many other known types of techniques for
introducing DNA into cells in vivo may be used (e.g.,
liposomes). Particularly useful would be naked DNA
techniques, since naked DNA is known to be readily taken
up by skeletal muscle cells upon injection into muscle.
Wildtype dysferlin protein can also be administered to an
individual who either expresses mutant dysferlin protein
or expresses an inadequate amount of dysferlin protein,
e.g., a MM/LGMD2B patient.

Administration of the dysferlin therapeutic agents
of the invention can include local or systemic
administration, including injection, oral administration,
particle gun, or catheterized administration, and topical
administration. Various methods can be used to
administer the therapeutic dysferlin composition directly
to a specific site in the body. For example, a specific
muscle can be located and the therapeutic dysferlin
composition injected several times in several different
locations within the body of the muscle. The
therapeutic dysferlin composition can be directly
administered to the surface of the muscle, for example,
by topical application of the composition. X-ray imaging
can be used to assist in certain of the above delivery
methods. Combination therapeutic agents, including a
dysferlin protein or polypeptide or a subgenomic
dysferlin polynucleotide and other therapeutic agents,
can be administered simultaneously or sequentially.

Receptor-mediated targeted delivery of therapeutic
compositions containing dysferlin subgenomic
polynucleotides to specific tissues can also be used.
Receptor-mediated DNA delivery techniques are described
in, for example, Findeis et al. (1993), Trends in
Biotechnol. 11, 202-05; Chiou et al. (1994), Gene
Therapeutics: Methods and Applications of Direct Gene
Transfer (J. A. Wolff, ed.); Wu & Wu (1988), J. Biol.
Chem. 263, 621-24; Wu et al. (1994), J. Biol. Chem. 269,
542-46; Zenke et al. (1990), Proc. Natl. Acad. Sci.
U.S.A. 87, 3655-59; and Wu et al. (1991), J. Biol. Chem.
266, 338-42.

Alternatively, a dysferlin therapeutic composition
can be introduced into human cells ex vivo, and the cells
then implanted into the human. Cells can be removed from
a variety of locations including, for example, from a
selected muscle. The removed cells can then be contacted
with the dysferlin therapeutic composition utilizing any
of the above-described techniques, followed by the return
of the cells to the human, preferably to or within the
vicinity of a muscle. The above-described methods can
additionally comprise the steps of depleting fibroblasts
or other contaminating non-muscle cells subsequent to
removing muscle cells from a human.

Both the dose of the dysferlin composition and the
means of administration can be determined based on the
specific qualities of the therapeutic composition, the
condition, age, and weight of the patient, the
progression of the disease, and other relevant factors.
If the composition contains dysferlin protein or
polypeptide, effective dosages of the composition are in
the range of about 1 µg to about 100 mg/kg of patient
body weight, e.g., about 50 µg to about 50 mg/kg of
patient body weight, e.g., about 500 µg to about 5 mg/kg
of patient body weight.
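
The weight-based ranges above translate into absolute amounts by simple
multiplication, sketched below. The sketch assumes that both ends of
each stated range are per kilogram of body weight; the tier labels
("broad", "intermediate", "narrow") and the example weight are
hypothetical, and the calculation is arithmetic on the stated ranges
only, not a dosing recommendation.

```python
# Stated ranges expressed in micrograms per kilogram of body weight.
RANGES_UG_PER_KG = {
    "broad": (1, 100_000),         # about 1 µg to about 100 mg per kg
    "intermediate": (50, 50_000),  # about 50 µg to about 50 mg per kg
    "narrow": (500, 5_000),        # about 500 µg to about 5 mg per kg
}


def dose_window_ug(weight_kg, tier="broad"):
    """Return (low, high) total protein dose in micrograms for a patient."""
    low, high = RANGES_UG_PER_KG[tier]
    return low * weight_kg, high * weight_kg


if __name__ == "__main__":
    for tier in RANGES_UG_PER_KG:
        low, high = dose_window_ug(70, tier)
        print(f"{tier}: {low} µg to {high / 1000} mg")
```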

Therapeutic compositions containing dysferlin
subgenomic polynucleotides can be administered in a range
of about 0.1 µg to about 10 mg of DNA/dose for local
administration in a gene therapy protocol. Concentration
ranges of about 0.1 µg to about 10 mg, e.g., about 1 µg
to about 1 mg, e.g., about 10 µg to about 100 µg of DNA
can also be used during a gene therapy protocol. Factors
such as method of action and efficacy of transformation
and expression are considerations that will affect the
dosage required for ultimate efficacy of the dysferlin
subgenomic polynucleotides. Where greater expression is
desired over a larger area of tissue, larger amounts of
dysferlin subgenomic polynucleotides or the same amounts
readministered in a successive protocol of
administrations, or several administrations to different
adjacent or close tissue portions of, for example, a
muscle site, may be required to effect a positive
therapeutic outcome. In all cases, routine
experimentation in clinical trials will determine
specific ranges for optimal therapeutic effect.

Animal Model 

A line of transgenic animals (e.g., mice, rats,
guinea pigs, hamsters, rabbits, or other mammals) can be
produced bearing a transgene encoding a defective form of
dysferlin. Standard methods of generating such
transgenic animals would be used, e.g., as described
below.

Alternatively, standard methods of producing null
(i.e., knockout) mice could be used to generate a mouse
which bears one defective and one wild type allele
encoding dysferlin. If desired, two such heterozygous
mice could be crossed to produce offspring which are
homozygous for the mutant allele. The homozygous mutant
offspring would be expected to have a phenotype
comparable to the human MM and/or LGMD2B phenotype, and
so serve as models for the human disease.
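
The expected outcome of the heterozygous cross described above follows
from standard Mendelian segregation and can be enumerated as a Punnett
square. The sketch below assumes a single autosomal locus with equal
transmission of both alleles and no effect of genotype on viability; the
allele symbols "+" (wild type) and "-" (mutant) are notation chosen for
this example.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Return expected genotype frequencies among offspring.

    Each parent is a pair of alleles, e.g. ("+", "-") for a heterozygote.
    """
    offspring = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Write the wild type allele first so "+/-" and "-/+" are pooled.
        pair = sorted((allele_a, allele_b), key=lambda allele: allele != "+")
        offspring["/".join(pair)] += 1
    total = sum(offspring.values())
    return {genotype: count / total for genotype, count in offspring.items()}


if __name__ == "__main__":
    heterozygote = ("+", "-")
    print(cross(heterozygote, heterozygote))
```

Under these assumptions one quarter of the offspring are expected to be
homozygous for the mutant allele, which sets the litter numbers needed
to obtain a given number of model animals.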

For example, in one embodiment, dysferlin mutations
are introduced into a dysferlin gene of a cell, e.g., a
fertilized oocyte or an embryonic stem cell. Such cells
can then be used to create non-human transgenic animals
in which exogenous altered (e.g., mutated) dysferlin
sequences have been introduced into their genome or
homologously recombinant animals in which endogenous
dysferlin nucleic acid sequences have been altered. Such
animals are useful for studying the function and/or
activity of dysferlin and for identifying and/or
evaluating modulators of dysferlin function. As used
herein, a "transgenic animal" is a non-human animal,
preferably a mammal, more preferably a rodent such as a
rat or mouse, in which one or more of the cells of the
animal includes a transgene. Other examples of
transgenic animals include non-human primates, sheep,
dogs, cows, goats, chickens, amphibians, etc. A
transgene is exogenous DNA which is integrated into the
genome of a cell from which a transgenic animal develops
and which remains in the genome of the mature animal,
thereby directing the expression of an encoded gene
product in one or more cell types or tissues of the
transgenic animal. As used herein, a "homologously
recombinant animal" is a non-human animal, preferably a
mammal, more preferably a mouse, in which an endogenous
dysferlin gene has been altered by homologous
recombination between the endogenous gene and an
exogenous DNA molecule introduced into a cell of the
animal, e.g., an embryonic cell of the animal, prior to
completed development of the animal.

A transgenic animal of the invention can be created
by introducing a nucleic acid encoding a dysferlin
mutation into the male pronuclei of a fertilized oocyte,
e.g., by microinjection or retroviral infection, and
allowing the oocyte to develop in a pseudopregnant female
foster animal. A dysferlin cDNA sequence (e.g., that of
SEQ ID NO:1 or SEQ ID NO:3) can be introduced as a
transgene into the genome of a non-human animal.
Alternatively, a non-human homologue of the human
dysferlin gene can be isolated based on hybridization to
the human dysferlin sequence (e.g., cDNA) and used as a
transgene. Intronic sequences and polyadenylation
signals can also be included in the transgene to increase
the efficiency of expression of the transgene. Methods
for generating transgenic animals via embryo manipulation
and microinjection, particularly animals such as mice,
have become conventional in the art and are described,
for example, in U.S. Patent Nos. 4,736,866 and
4,870,009, U.S. Patent No. 4,873,191 and in Hogan,
Manipulating the Mouse Embryo, (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., 1986).
Similar methods are used for production of other
transgenic animals. A transgenic founder animal can be
identified based upon the presence of the mutant
dysferlin transgene in its genome and/or expression of
the mutant dysferlin mRNA in tissues or cells of the
animals. A transgenic founder animal can then be used to
breed additional animals carrying the transgene.
Moreover, transgenic animals carrying a transgene
encoding a mutant dysferlin can further be bred to other
transgenic animals carrying other transgenes.

To create a homologously recombinant animal, a
vector is prepared which contains at least a portion of a
dysferlin gene into which a deletion, addition or
substitution has been introduced to thereby alter a
dysferlin gene. In a preferred embodiment, the vector is
designed such that, upon homologous recombination, the
endogenous dysferlin gene is functionally disrupted
(i.e., no longer encodes a functional protein; also
referred to as a "knock out" vector). Alternatively, the
vector can be designed such that, upon homologous
recombination, the endogenous dysferlin gene is mutated
or otherwise altered (e.g., contains one of the mutations
described in Table 2). In the homologous recombination
vector, the altered portion of the dysferlin sequence is
flanked at its 5' and 3' ends by additional nucleic acid
of the dysferlin gene to allow for homologous
recombination to occur between the exogenous dysferlin
nucleic acid sequence carried by the vector and an
endogenous dysferlin gene in an embryonic stem cell. The
additional flanking dysferlin nucleic acid is of
sufficient length for successful homologous recombination
with the endogenous gene. Typically, several kilobases
of flanking DNA (both at the 5' and 3' ends) are included
in the vector (see, e.g., Thomas and Capecchi (1987) Cell
51:503 for a description of homologous recombination
vectors). The vector is introduced into an embryonic
stem cell line (e.g., by electroporation) and cells in
which the introduced dysferlin sequence has homologously
recombined with the endogenous dysferlin gene are
selected (see, e.g., Li et al. (1992) Cell 69:915). The
selected cells are then injected into a blastocyst of an
animal (e.g., a mouse) to form aggregation chimeras (see,
e.g., Bradley in Teratocarcinomas and Embryonic Stem
Cells: A Practical Approach, Robertson, ed. (IRL, Oxford,
1987) pp. 113-152). A chimeric embryo can then be
implanted into a suitable pseudopregnant female foster
animal and the embryo brought to term. Progeny harboring
the homologously recombined DNA in their germ cells can
be used to breed animals in which all cells of the animal
contain the homologously recombined DNA by germline
transmission of the transgene. Methods for constructing
homologous recombination vectors and homologous
recombinant animals are described further in Bradley
(1991) Current Opinion in Bio/Technology 2:823-829 and in
PCT Publication Nos. WO 90/11354, WO 91/01140, WO
92/0968, and WO 93/04169.

Other Embodiments

It is to be understood that while the invention has
been described in conjunction with the detailed
description thereof, the foregoing description is
intended to illustrate and not limit the scope of the
invention, which is defined by the scope of the appended
claims. Other aspects, advantages, and modifications are
within the scope of the following claims.

What is claimed is:

1. An isolated DNA comprising a nucleotide sequence
which hybridizes under stringent hybridization conditions
to SEQ ID NO: 3, or a complement thereof.

2. The isolated DNA of claim 1, wherein the
nucleotide sequence is SEQ ID NO: 117.

3. An isolated DNA comprising a nucleotide sequence
selected from the group consisting of SEQ ID NOs:4-12.

4. The isolated DNA of claim 3, comprising the
sequence of SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ
ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ
ID NO: 20, or SEQ ID NO: 21.

5. An isolated DNA comprising a nucleotide sequence
selected from the group consisting of SEQ ID NOs:22-30.

6. A single stranded oligonucleotide of 14-50
nucleotides in length having a nucleotide sequence
identical to a portion of SEQ ID NO: 3, or a complement
thereof.

7. A pair of PCR primers consisting of:

(a) a first single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of SEQ ID NO: 117; and

(b) a second single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of SEQ ID NO: 117, wherein the
sequence of at least one of the oligonucleotides is
identical to a portion of a strand of SEQ ID NO: 3, and
the first oligonucleotide is not complementary to the
second oligonucleotide.

8. A pair of single-stranded oligonucleotides,
wherein both oligonucleotides are selected from the group
consisting of SEQ ID NOs:130-231, SEQ ID NO:110, and SEQ
ID NO:112 and the oligonucleotides are different from
each other.

9. An isolated DNA comprising a nucleotide sequence
that encodes a polypeptide that shares at least 70%
sequence identity with SEQ ID NO: 2, or a complement of
the nucleotide sequence.

10. The isolated DNA of claim 9, wherein the
polypeptide comprises the sequence of SEQ ID NO: 2.

11. An isolated DNA comprising a nucleotide
sequence which hybridizes under stringent hybridization
conditions to a nucleic acid having a sequence selected
from the group consisting of SEQ ID NOs:31-79 and 90-100.

12. A single stranded oligonucleotide of 14-50
nucleotides in length comprising a nucleotide sequence
which is identical to a portion of a nucleic acid
selected from the group consisting of SEQ ID NOs:31-79
and 90-100, or a complement of the nucleotide sequence.

13. The oligonucleotide of claim 12, wherein the 
portion includes an intronic sequence. 

14. A pair of PCR primers consisting of:

(a) a first single-stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of a sense strand of a nucleic
acid selected from the group consisting of SEQ ID
NOs:31-85; and

(b) a second single stranded oligonucleotide
consisting of 14-50 contiguous nucleotides that are
identical to a portion of the antisense strand of a
nucleic acid selected from the group consisting of SEQ ID
NOs:31-85, wherein the sequence of at least one of the
oligonucleotides comprises a sequence identical to a
portion of a nucleic acid selected from SEQ ID NOs:31-79
and 90-100, and wherein the first oligonucleotide is not
complementary to the second oligonucleotide.

15. A pair of single-stranded oligonucleotides
selected from the group consisting of SEQ ID NOs:101-116,
SEQ ID NOs:184-185, SEQ ID NOs:188-191, SEQ ID
NOs:210-213, and SEQ ID NOs:216-217.

16. A vector comprising the isolated DNA of claim
1.

17. A substantially pure polypeptide comprising an 
amino acid sequence sharing at least 70% sequence 
identity with SEQ ID NO: 2. 

18. The substantially pure polypeptide of claim 17,
wherein the polypeptide comprises an amino acid sequence
identical to that of a naturally occurring polypeptide.

19. The substantially pure polypeptide of claim 18, 
wherein the amino acid sequence comprises the sequence of 
SEQ ID NO: 2. 

20. A substantially pure polypeptide comprising an
amino acid sequence identical to the amino acid sequence
of amino acid residues 1-500, 501-1000, 1001-1500, or
1501-2080 of SEQ ID NO: 2.

21. A substantially pure polypeptide comprising the
amino acid sequence of SEQ ID NO: 86, SEQ ID NO: 87, SEQ ID
NO: 88 or SEQ ID NO: 89.

22. A substantially pure polypeptide selected from
the group consisting of amino acids 253-403 of SEQ ID
NO: 2, amino acids 624-865 of SEQ ID NO: 2, and amino acids
1664-1786 of SEQ ID NO: 2.

23. A fusion protein comprising a polypeptide of
claim 22.

24. An antibody that specifically binds to the
polypeptide of claim 22.

25. An antibody that binds specifically to the 
polypeptide of claim 17. 

26. A cell comprising the isolated DNA of claim 1. 

27. A non-human mammal, the genomic DNA of which
bears a transgene, wherein the transgene comprises the
isolated DNA of claim 1.

28. A transgenic non-human mammal having a
transgene disrupting or interfering with the expression
of a dysferlin gene.

29. A method of decreasing the symptoms of muscular 
dystrophy in a mammal, the method comprising introducing 
into a cell of said mammal the isolated DNA of claim 1. 

30. A method of decreasing the symptoms of muscular
dystrophy in a mammal, the method comprising introducing
into a cell of said mammal the vector of claim 16, the
vector being an expression vector.

31. A method of decreasing the symptoms of muscular
dystrophy in a mammal, the method comprising introducing
into a cell of said mammal the protein of claim 17.

32. A method for identifying a patient, a fetus, or
a pre-embryo at risk for having a dysferlin-related
disorder, the method comprising:

(a) obtaining a sample of genomic DNA from the
patient, fetus, or pre-embryo; and

(b) determining whether the sample contains a
mutation in a dysferlin gene, wherein a patient, a fetus,
or a pre-embryo having a mutation in a dysferlin gene is
at risk for having a dysferlin-related disorder.

33. The method of claim 32, comprising: 

(a) treating the sample of genomic DNA with a 
restriction enzyme specific for a particular restriction 
enzyme site; and 

(b) detecting the presence or absence of the 
particular restriction enzyme site in the sample of 
genomic DNA as an indication of the presence or absence 
of a particular mutation in the genomic DNA. 

34. The method of claim 33, wherein the restriction
enzyme is selected from the group consisting of Pst I,
Fnu4H I, BamH I, BstY I, Ava II, HinP I, Fsp I, Mbo II,
ScrF I, BstN I, Mae I, Bfa I, Dde I, Bpm I, Ban II, Ava
II, and Sau96 I.
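
Illustrative note (not part of the claims): the restriction-site test recited in claims 33 and 34 can be pictured with a short Python sketch. It only assumes the widely documented recognition sequences of Pst I (CTGCAG) and BamH I (GGATCC); the function name and the two 20-nucleotide fragments are hypothetical and are not taken from SEQ ID NO:1.

```python
# Recognition sequences (5'->3') for two of the enzymes named in claim 34.
# Both sites are palindromic, so scanning one strand is sufficient here.
RECOGNITION_SITES = {
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
}


def site_positions(sequence, enzyme):
    """Return the 1-based start positions of the enzyme's recognition site."""
    site = RECOGNITION_SITES[enzyme]
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)        # convert 0-based index to 1-based
        start = seq.find(site, start + 1)  # keep scanning for further sites
    return positions


# Hypothetical fragments, invented for illustration only.
reference = "AAGGTCCTGCAGGTGTCCTA"  # carries one Pst I site
variant = "AAGGTCCTGTAGGTGTCCTA"    # a single C-to-T change removes the site

reference_sites = site_positions(reference, "PstI")
variant_sites = site_positions(variant, "PstI")

# A site present in the reference but absent in the sample suggests a change.
site_lost = bool(reference_sites) and not variant_sites
```

In this toy case the site is found in the reference fragment and not in the variant, which is the kind of presence-or-absence readout that step (b) of claim 33 describes.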

35. The method of claim 32, comprising subjecting
the sample to polymerase chain reaction (PCR).

36. The method of claim 32, comprising:

(a) contacting a single-stranded oligonucleotide
with the sample of genomic DNA; and

(b) detecting hybridization or lack thereof between
the single-stranded oligonucleotide and the genomic DNA,
as an indication of the presence or absence of a mutation
in the genomic DNA.



37. A method for identifying a patient, a fetus, or
a pre-embryo at risk for having a dysferlin-related
disorder, said method comprising:

(a) providing a sample comprising dysferlin mRNA
from the patient, fetus, or pre-embryo; and

(b) determining whether the dysferlin mRNA contains
a mutation, wherein a patient, a fetus, or a pre-embryo
having a dysferlin mRNA containing a mutation is at risk
for having a dysferlin-related disorder.

38. The method of claim 37, wherein the presence or 
absence of the mutation is detected by Northern blot. 

39. The method of claim 37, wherein the method
includes the step of subjecting the sample to polymerase
chain reaction (PCR).



40. A method for detecting the absence of a
mutation in a dysferlin protein of a patient, a fetus, or
a pre-embryo, the method comprising:

(a) providing a sample comprising a dysferlin
protein of the patient, fetus, or pre-embryo;

(b) contacting the sample with the antibody of
claim 22; and

(c) detecting binding of the antibody to dysferlin
protein in the sample, if any, wherein binding indicates
a normal dysferlin protein.

41. An isolated DNA comprising a nucleotide sequence
that is identical to the sequence of nucleotides
3501-3520 of SEQ ID NO:1, 3737-3756 of SEQ ID NO:1, 3842-3861
of SEQ ID NO:1, 5114-5139 of SEQ ID NO:1, or 5239-5255
of SEQ ID NO:1.

42. An isolated DNA comprising a nucleotide
sequence selected from the group consisting of

3501-3520 of SEQ ID NO:1, wherein nucleotide G at
3510 is A;

3737-3756 of SEQ ID NO:1, wherein nucleotide G at
3746 is deleted;

3842-3861 of SEQ ID NO:1, wherein nucleotide C at
3851 is T;

5114-5139 of SEQ ID NO:1, wherein nucleotide C at
5122 and nucleotide A at 5123 are deleted;

5239-5255 of SEQ ID NO:1, wherein nucleotide G at
5245 is deleted and nucleotide G at 5249 is C; and

5239-5255 of SEQ ID NO:1, wherein nucleotide G at
5245 is C and nucleotide G at 5249 is deleted.
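
Illustrative note (not part of the claims): the variants in claim 42 are stated as substitutions and deletions at positions numbered along SEQ ID NO:1. The Python sketch below shows one way such edits could be applied to a fragment while keeping that numbering. The helper name and the 17-nucleotide fragment are hypothetical; the fragment is invented so that a G sits at positions 5245 and 5249, and it is not the actual sequence of nucleotides 5239-5255.

```python
def apply_edits(fragment, first_position, edits):
    """Apply point edits expressed in full-sequence coordinates.

    fragment:       nucleotide string covering a window of the full sequence
    first_position: coordinate of the fragment's first nucleotide
    edits:          list of (position, expected_base, new_base) tuples;
                    an empty new_base means the nucleotide is deleted
    """
    bases = list(fragment.upper())
    for position, expected, new_base in edits:
        index = position - first_position
        if not 0 <= index < len(bases):
            raise ValueError(f"position {position} lies outside the fragment")
        if bases[index] != expected:
            raise ValueError(f"expected {expected} at position {position}")
        # Editing one list slot at a time keeps later coordinates valid,
        # because a deletion leaves an empty string rather than shifting.
        bases[index] = new_base
    return "".join(bases)


# Hypothetical stand-in for nucleotides 5239-5255 (invented, 17 nt).
fragment = "ACTTCAGATTGCCATAC"

# Pattern of the fifth listed variant: G at 5245 deleted, G at 5249 is C.
variant_a = apply_edits(fragment, 5239, [(5245, "G", ""), (5249, "G", "C")])

# Pattern of the sixth listed variant: G at 5245 is C, G at 5249 deleted.
variant_b = apply_edits(fragment, 5239, [(5245, "G", "C"), (5249, "G", "")])
```

Both results are one nucleotide shorter than the fragment but differ from each other, which is why the claim lists the two combinations separately.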

43. An isolated nucleic acid comprising a 
nucleotide sequence which hybridizes under stringent 
hybridization conditions to nucleic acids 3284-3720 of 
SEQ ID NO: 232, or the complement of said nucleotide
sequence.

44. An isolated nucleic acid comprising a 
nucleotide sequence identical to the sequence of 
nucleotides 3284-3720 of SEQ ID NO: 232, or a complement 
of said nucleotide sequence. 

45. The isolated nucleic acid of claim 44, wherein
the nucleotide sequence comprises the sequence of SEQ ID
NO: 232 or the complement of SEQ ID NO: 232.

46. An isolated polypeptide comprising:

a) at least 15 contiguous amino acids of the
polypeptide comprising amino acids 1-24 of SEQ ID NO: 233,

b) a naturally occurring allelic variant of a
polypeptide comprising amino acids 1-24 of SEQ ID NO: 233,
or

c) an amino acid sequence which is encoded by a
nucleic acid molecule which hybridizes under stringent
conditions to nucleotides 3284-3720 of SEQ ID NO: 232.

47. The polypeptide of claim 46, wherein the
polypeptide comprises SEQ ID NO:233.

48. A vector comprising the nucleic acid of claim 44.

49. A cell comprising the vector of claim 48. 

50. A method of making a polypeptide, the method
comprising culturing the cell of claim 49.

51. An antibody which specifically binds to a 
polypeptide of claim 46. 

52. The antibody of claim 51, wherein the antibody
binds to a polypeptide selected from the group comprising
amino acids 253-403 of SEQ ID NO:233, amino acids 624-865
of SEQ ID NO: 233, and amino acids 1664-1786 of SEQ ID
NO:233.

53. The antibody of claim 51, wherein the antibody
is a monoclonal antibody.

54. The antibody of claim 51, wherein the antibody 
is a polyclonal antibody. 
[Amino acid sequence of SEQ ID NO:2 in one-letter code, residues 1-2080]

FIG. 2
SEQUENCE LISTING
<110> The General Hospital Corporation 

<120> DYSFERLIN, A GENE MUTATED IN DISTAL MYOPATHY AND LIMB 
GIRDLE MUSCULAR DYSTROPHY 

<130> 00786/399WO2 

<150> US 60/097,927 
<151> 1998-08-25 

<160> 233 

<170> FastSEQ for Windows Version 3.0 

<210> 1 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (374)...(6613)

<400> 1
tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180
acccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360
acacgcgcca agc atg ctg agg gtc ttc atc ctc tat gcc gag aac gtc 409
Met Leu Arg Val Phe Ile Leu Tyr Ala Glu Asn Val
1 5 10

cac aca ccc gac acc gac atc agc gat gcc tac tgc tcc gcg gtg ttt 457
His Thr Pro Asp Thr Asp Ile Ser Asp Ala Tyr Cys Ser Ala Val Phe
15 20 25

gca ggg gtg aag aag aga acc aaa gtc atc aag aac agc gtg aac cct 505
Ala Gly Val Lys Lys Arg Thr Lys Val Ile Lys Asn Ser Val Asn Pro
30 35 40

gta tgg aat gag gga ttt gaa tgg gac ctc aag ggc atc ccc ctg gac 553
Val Trp Asn Glu Gly Phe Glu Trp Asp Leu Lys Gly Ile Pro Leu Asp
45 50 55 60

cag ggc tct gag ctt cat gtg gtg gtc aaa gac cat gag acg atg ggg 601
Gln Gly Ser Glu Leu His Val Val Val Lys Asp His Glu Thr Met Gly
65 70 75

agg aac agg ttc ctg ggg gaa gcc aag gtc cca ctc cga gag gtc ctc 649
Arg Asn Arg Phe Leu Gly Glu Ala Lys Val Pro Leu Arg Glu Val Leu
80 85 90

gcc acc cct agt ctg tcc gcc agc ttc aat gcc ccc ctg ctg gac acc 697
Ala Thr Pro Ser Leu Ser Ala Ser Phe Asn Ala Pro Leu Leu Asp Thr
95 100 105

aag aag cag ccc aca ggg gcc tcg ctg gtc ctg cag gtg tcc tac aca 745
Lys Lys Gln Pro Thr Gly Ala Ser Leu Val Leu Gln Val Ser Tyr Thr
110 115 120

ccg ctg cct gga gct gtg ccc ctg ttc ccg ccc cct act cct ctg gag 793
Pro Leu Pro Gly Ala Val Pro Leu Phe Pro Pro Pro Thr Pro Leu Glu
125 130 135 140

ccc tcc ccg act ctg cct gac ctg gat gta gtg gca gac aca gga gga 841
Pro Ser Pro Thr Leu Pro Asp Leu Asp Val Val Ala Asp Thr Gly Gly
145 150 155

gag gaa gac aca gag gac cag gga ctc act gga gat gag gcg gag cca 889
Glu Glu Asp Thr Glu Asp Gln Gly Leu Thr Gly Asp Glu Ala Glu Pro
160 165 170

ttc ctg gat caa agc gga ggc ccg ggg gct ccc acc acc cca agg aaa 937
Phe Leu Asp Gln Ser Gly Gly Pro Gly Ala Pro Thr Thr Pro Arg Lys
175 180 185

cta cct tca cgt cct ccg ccc cac tac ccc ggg atc aaa aga aag cga 985
Leu Pro Ser Arg Pro Pro Pro His Tyr Pro Gly Ile Lys Arg Lys Arg
190 195 200

agt gcg cct aca tct aga aag ctg ctg tca gac aaa ccg cag gat ttc 1033
Ser Ala Pro Thr Ser Arg Lys Leu Leu Ser Asp Lys Pro Gln Asp Phe
205 210 215 220

cag atc agg gtc cag gtg atc gag ggg cgc cag ctg ccg ggg gtg aac 1081
Gln Ile Arg Val Gln Val Ile Glu Gly Arg Gln Leu Pro Gly Val Asn
225 230 235

atc aag cct gtg gtc aag gtt acc gct gca ggg cag acc aag cgg acg 1129
Ile Lys Pro Val Val Lys Val Thr Ala Ala Gly Gln Thr Lys Arg Thr
240 245 250

cgg atc cac aag gga aac agc cca ctc ttc aat gag act ctt ttc ttc 1177
Arg Ile His Lys Gly Asn Ser Pro Leu Phe Asn Glu Thr Leu Phe Phe
255 260 265

aac ttg ttt gac tct cct ggg gag ctg ttt gat gag ccc atc ttt atc 1225
Asn Leu Phe Asp Ser Pro Gly Glu Leu Phe Asp Glu Pro Ile Phe Ile
270 275 280

acg gtg gta gac tct cgt tct ctc agg aca gat gct ctc ctc ggg gag 1273
Thr Val Val Asp Ser Arg Ser Leu Arg Thr Asp Ala Leu Leu Gly Glu
285 290 295 300

ttc cgg atg gac gtg ggc acc att tac aga gag ccc cgg cac gcc tat 1321
Phe Arg Met Asp Val Gly Thr Ile Tyr Arg Glu Pro Arg His Ala Tyr
305 310 315

ctc agg aag tgg ctg ctg ctc tca gac cct gat gac ttc tct gct ggg 1369
Leu Arg Lys Trp Leu Leu Leu Ser Asp Pro Asp Asp Phe Ser Ala Gly
320 325 330

gcc aga ggc tac ctg aaa aca agc ctt tgt gtg ctg ggg cct ggg gac 1417
Ala Arg Gly Tyr Leu Lys Thr Ser Leu Cys Val Leu Gly Pro Gly Asp
335 340 345

gaa gcg cct ctg gag aga aaa gac ccc tct gaa gac aag gag gac att 1465
Glu Ala Pro Leu Glu Arg Lys Asp Pro Ser Glu Asp Lys Glu Asp Ile
350 355 360

gaa agc aac ctg ctc cgg ccc aca ggc gta gcc ctg cga gga gcc cac 1513
Glu Ser Asn Leu Leu Arg Pro Thr Gly Val Ala Leu Arg Gly Ala His
365 370 375 380

ttc tgc ctg aag gtc ttc cgg gcc gag gac ttg ccg cag atg gac gat 1561
Phe Cys Leu Lys Val Phe Arg Ala Glu Asp Leu Pro Gln Met Asp Asp
385 390 395

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gcc gtg atg gac aac gtg aaa cag ate ttt ggc ttc gag agt aac aag 
Ala Val Met Asp Asn Val Lys Gin He Phe Gly Phe Glu Ser Asn Lys 
400 405 410 

aag aac ttg gtg gac ccc ttt gtg gag gtc age ttt gcg ggg aaa atg 
LyI Asn Leu Val Asp Pro Phe Val Glu Val Ser Phe Ala Gly Lys Met 
415 420 425 

ctg tgc age aag ate ttg gag aag acg gcc aac cct cag tgg aac cag 
Leu Cys Ser Lys He Leu Glu Lys Thr Ala Asn Pro Gin Trp Asn Gin 
430 435 440 

aac ate aca ctg cct gcc atg ttt ccc tec atg tgc gaa aaa atg agg 
Asn He Thr Leu Pro Ala Met Phe Pro Ser Met Cys Glu Lys Met Arg 
445 450 455 460 

att cat ate ata gac tgg gac cgc ctg act cac aat gac ate gtg get 
He Arg He lie Asp Trp Asp Arg Leu Thr His Asn Asp He Val Ala 
465 470 475 

acc acc tac ctg agt atg teg aaa ate tct gcc cct gga gga gaa ata 
Thr Thr Tyr Leu Ser Met Ser Lys He Ser Ala Pro Gly Gly Glu He 
480 485 490 

gaa gag gag cct gca ggt get gtc aag cct teg aaa gcc tea gac ttg 
Ilu Glu Glu Pro Ala Gly Ala Val Lys Pro Ser Lys Ala Ser Asp Leu 
495 500 505 

gat gac tac ctg ggc ttc etc ccc act ttt ggg ccc tgc tac ate aac 
Asp Asp Tyr Leu Gly Phe Leu Pro Thr Phe Gly Pro Cys Tyr He Asn 
510 515 520 

etc tat ggc agt ccc aga gag ttc aca ggc ttc cca gac ccc tac aca 
Leu Tyr Gly Ser Pro Arg Glu Phe Thr Gly Phe Pro Asp Pro Tyr Thr 
525 530 535 540 

qag etc aac aca ggc aag ggg gaa ggt gtg get tat cgt ggc egg ctt 
Glu Leu Asn Thr Gly Lys Gly Glu Gly Val Ala Tyr Arg Gly Arg Leu 
545 550 555 

ctg etc tec ctg gag acc aag ctg gtg gag cac agt gaa cag aag gtg 
Leu Leu Ser Leu Glu Thr Lys Leu Val Glu His Ser Glu Gin Lys Val 
560 565 570 

aaa aac ctt cct gcg gat gac ate etc egg gtg gag aag tac ctt agg 
Glu Asp Leu Pro Ala Asp Asp He Leu Arg Val Glu Lys Tyr Leu Arg 
575 580 585 

agg cgc aag tac tec ctg ttt gcg gcc ttc tac tea gcc acc atg ctg 
Arq Arg Lys Tyr Ser Leu Phe Ala Ala Phe Tyr Ser Ala Thr Met Leu 
590 595 600 

caq qat gtg gat gat gcc ate cag ttt gag gtc age ate ggg aac tac 
Gin Asp Val Asp Asp Ala He Gin Phe Glu Val Ser He Gly Asn Tyr 
605 ~ 610 615 620 

qqg aac aag ttc gac atg acc tgc ctg ccg ctg gcc tec acc act cag 
Glv Asn Lys Phe Asp Met Thr Cys Leu Pro Leu Ala Ser Thr Thr Gin 
625 630 635 

tac aqc cgt gca gtc ttt gac ggg tgc cac tac tac tac eta ccc tgg 
Tyr Ser Arg Ala Val Phe Asp Gly Cys His Tyr Tyr Tyr Leu Pro Trp 
640 645 650 



1609 



1657 



1705 



1753 



1801 



1849 



1897 



1945 



1993 



2041 



2089 



2137 



2185 



2233 



2281 



2329 
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ggt aac gtg aaa cct gtg gtg gtg ctg tea tec tac tgg gag gac ate 
Gly Asn Val Lys Pro Val Val Val Leu Ser Ser Tyr Trp Glu Asp lie 
655 660 665 



gee ace cac ctg gac cag tac ctg tac cag ctg cgc acc cat cac ctg 
Ala Thr His Leu Asp Gin Tyr Leu Tyr Gin Leu Arg Thr His His Leu 
735 740 745 

age caa ate act gag get gee ctg gee ctg aag etc ggc cac agt gag 
Ser Gin lie Thr Glu Ala Ala Leu Ala Leu Lys Leu Gly His Ser Glu 
750 755 760 



gee ctg gca gag gag ccc cag aac age ctg ccg gac ate gtc ate tgg 
Ala Leu Ala Glu Glu Pro Gin Asn Ser Leu Pro Asp lie Val lie Trp 
785 790 795 



caa gtc etc ttc tec egg egg ggt gee aac tac tgt ggc aag aat tgt 
Gin Val Leu Phe Ser Arg Arg Gly Ala Asn Tyr Cys Gly Lys Asn Cys 
815 820 825 



tct gtg gat gag aag gag ttc aac cag ttt get gag ggg aag ctg tct 
Ser Val Asp Glu Lys Glu Phe Asn Gin Phe Ala Glu Gly Lys Leu Ser 
865 870 875 



aac tgg ggc aca acg ggc etc acc tac ccc aag ttt tct gac gtc acg 
Asn Trp Gly Thr Thr Gly Leu Thr Tyr Pro Lys Phe Ser Asp Val Thr 
895 900 905 



2377 



age cat aga ate gag act cag aac cag ctg ctt ggg att get gac egg 2425 
Ser His Arg He Glu Thr Gin Asn Gin Leu Leu Gly He Ala Asp Arg 
670 675 680 

ctg gaa get ggc ctg gag cag gtc cac ctg gee ctg aag gcg cag tgc 2473 
Leu Glu Ala Gly Leu Glu Gin Val His Leu Ala Leu Lys Ala Gin Cys 
685 690 695 700 

tec acg gag gac gtg gac teg ctg gtg get cag ctg acg gat gag etc 2521 
Ser Thr Glu Asp Val Asp Ser Leu Val Ala Gin Leu Thr Asp Glu Leu 
705 710 715 

ate gca ggc tgc age cag cct ctg ggt gac ate cat gag aca ccc tct 2569 
He Ala Gly Cys Ser Gin Pro Leu Gly Asp He His Glu Thr Pro Ser 
720 725 730 



2617 



2665 



etc cct gca get ctg gag cag gcg gag gac tgg etc ctg cgt ctg cgt 2713 
Leu Pro Ala Ala Leu Glu Gin Ala Glu Asp Trp Leu Leu Arg Leu Arg 
765 770 775 780 



2761 



atg ctg cag gga gac aag cgt gtg gca tac cag egg gtg ccc gee cac 2809 
Met Leu Gin Gly Asp Lys Arg Val Ala Tyr Gin Arg Val Pro Ala His 
800 805 810 



2857 



ggg aag eta cag aca ate ttt ctg aaa tat ccg atg gag aag gtg cct 2905 
Gly Lys Leu Gin Thr He Phe Leu Lys Tyr Pro Met Glu Lys Val Pro 
830 835 840 

ggc gec egg atg cca gtg cag ata egg gtc aag ctg tgg ttt ggg etc 2953 
Gly Ala Arg Met Pro Val Gin He Arg Val Lys Leu Trp Phe Gly Leu 
845 850 855 860 



3001 



gtc ttt get gaa acc tat gag aac gag act aag ttg gee ctt gtt ggg 3049 
Val Phe Ala Glu Thr Tyr Glu Asn Glu Thr Lys Leu Ala Leu Val Gly 
880 885 890 



3097 



ggc aag ate aag eta ccc aag gac age ttc cgc ccc teg gee ggc tgg 3145 
Gly Lys He Lys Leu Pro Lys Asp Ser Phe Arg Pro Ser Ala Gly Trp 
910 915 920 
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acc tag get gga gat tgg ttc gtg tgt ccg gag aag act ctg etc cat 3193 
Thr Trp Ala Gly Asp Trp Phe Val Cys Pro Glu Lys Thr Leu Leu His 
925 930 935 940 

aac atg gac gec ggt cac ctg age ttc gtg gaa gag gtg ttt gag aac 3241 
Asp Met Asp Ala Gly His Leu Ser Phe Val Glu Glu Val Phe Glu Asn 
* 945 950 955 

cag acc egg ctt ccc gga ggc cag tgg ate tac atg agt gac aac tac 3289 
Gin Thr Arg Leu Pro Gly Gly Gin Trp lie Tyr Met Ser Asp Asn Tyr 
960 965 970 

acc gat gtg aac ggg gag aag gtg ctt ccc aag gat gac att gag tgc 3337 
Thr Asp Val Asn Gly Glu Lys Val Leu Pro Lys Asp Asp He Glu Cys 
975 980 985 

cca ctg ggc tgg aag tgg gaa gat gag gaa tgg tec aca gac etc aac 3385 
Pro Leu Gly Trp Lys Trp Glu Asp Glu Glu Trp Ser Thr Asp Leu Asn 
990 995 1000 

egg get gtc gat gag caa ggc tgg gag tat age ate acc ate ccc ccg 3433 
Ara Ala Val Asp Glu Gin Gly Trp Glu Tyr Ser He Thr He Pro Pro 
1005 1010 1015 1020 

gag egg aag ccg aag cac tgg gtc cct get gag aag atg tac tac aca 3481 
Glu Ara Lys Pro Lys His Trp Val Pro Ala Glu Lys Met Tyr Tyr Thr 
1025 1030 1035 

cac cga egg egg cgc tgg gtg cgc ctg cgc agg agg gat etc age caa 3529 
His Arg Arg Arg Arg Trp Val Arg Leu Arg Arg Arg Asp Leu Ser Gin 
1040 1045 1050 

atg gaa gca ctg aaa agg cac agg cag gcg gag gcg gag ggc gag ggc 3577 
Met Glu Ala Leu Lys Arg His Arg Gin Ala Glu Ala Glu Gly Glu Gly 
1055 1060 1065 

tgg gag tac gee tct ctt ttt ggc tgg aag ttc cac etc gag tac cgc 3625 
Trp Glu Tyr Ala Ser Leu Phe Gly Trp Lys Phe His Leu Glu Tyr Arg 
1070 1075 1080 

aag aca gat gee ttc cgc cgc cgc cgc tgg cgc cgt cgc atg gag cca 3673 
Lys Thr Asp Ala Phe Arg Arg Arg Arg Trp Arg Arg Arg Met Glu Pro 
1085 1090 1095 1100 

ctg gag aag acg ggg cct gca get gtg ttt gee ctt gag ggg gee ctg 3721 
Leu Glu Lys Thr Gly Pro Ala Ala Val Phe Ala Leu Glu Gly Ala Leu 
1105 IHO 1115 



ggc ggc gtg atg gat gac aag agt gaa gat tec atg tec gtc tec acc 
Glv Glv Val Met Asp Asp Lys Ser Glu Asp Ser Met Ser Val Ser Thr 
1120 H25 1130 



3769 



ttg age ttc ggt gtg aac aga ccc acg att tec tgc ata ttc gac tat 3817 
Leu Ser Phe Gly Val Asn Arg Pro Thr He Ser Cys He Phe Asp Tyr 
1135 1140 1145 

ggg aac cgc tac cat eta cgc tgc tac atg tac cag gee egg gac ctg 
Gly Asn Arg Tyr His Leu Arg Cys Tyr Met Tyr Gin Ala Arg Asp Leu 
1150 H55 H60 



3865 



get gcg atg gac aag gac tct ttt tct gat ccc tat gee ate gtc tec 3913 
Ala Ala Met Asp Lys Asp Ser Phe Ser Asp Pro Tyr Ala He Val Ser 
1165 H70 H75 1180 
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ttc ctg cac cag age cag aag acg gtg gtg gtg aag aac acc ctt aac 3961 
Phe Leu His Gin Ser Gin Lys Thr Val Val Val Lys Asn Thr Leu Asn 
1185 1190 1195 

ccc acc tgg gac cag acg etc ate ttc tac gag ate gag ate ttt ggc 4009 
Pro Thr Trp Asp Gin Thr Leu lie Phe Tyr Glu lie Glu He Phe Gly 
1200 1205 1210 

gag ccg gec aca gtt get gag caa ccg ccc age att gtg gtg gag ctg 4057 
Glu Pro Ala Thr Val Ala Glu Gin Pro Pro Ser He Val Val Glu Leu 
1215 1220 1225 

tac gac cat gac act tat ggt gca gac gag ttt atg ggt cgc tgc ate 4105 
Tyr Asp His Asp Thr Tyr Gly Ala Asp Glu Phe Met Gly Arg Cys He 
1230 1235 1240 

tgt caa ccg agt ctg gaa egg atg cca egg ctg gee tgg ttc cca ctg 4153 
Cys Gin Pro Ser Leu Glu Arg Met Pro Arg Leu Ala Trp Phe Pro Leu 
1245 1250 1255 1260 

acg agg ggc age cag ccg teg ggg gag ctg ctg gec tct ttt gag etc 4201 
Thr Arg Gly Ser Gin Pro Ser Gly Glu Leu Leu Ala Ser Phe Glu Leu 
1265 1270 1275 

ate cag aga gag aag ccg gee ate cac cat att cct ggt ttt gag gtg 4249 
He Gin Arg Glu Lys Pro Ala He His His He Pro Gly Phe Glu Val 
1280 1285 1290 

cag gag aca tea agg ate ctg gat gag tct gag gac aca gac ctg ccc 4297 
Gin Glu Thr Ser Arg He Leu Asp Glu Ser Glu Asp Thr Asp Leu Pro 
1295 1300 1305 

tac cca cca ccc cag agg gag gee aac ate tac atg gtt cct cag aac 4345 
Tyr Pro Pro Pro Gin Arg Glu Ala Asn He Tyr Met Val Pro Gin Asn 
1310 1315 1320 

ate aag cca gcg etc cag cgt acc gee ate gag ate ctg gca tgg ggc 4393 
He Lys Pro Ala Leu Gin Arg Thr Ala He Glu He Leu Ala Trp Gly 
1325 1330 1335 1340 

ctg egg aac atg aag agt tac cag ctg gee aac ate tec tec ccc age 4441 
Leu Arg Asn Met Lys Ser Tyr Gin Leu Ala Asn He Ser Ser Pro Ser 
1345 1350 1355 

etc gtg gta gag tgt ggg ggc cag acg gtg cag tec tgt gtc ate agg 4489 
Leu Val Val Glu Cys Gly Gly Gin Thr Val Gin Ser Cys Val He Arg 
1360 1365 1370 

aac etc egg aag aac ccc aac ttt gac ate tgc acc etc ttc atg gaa 4537 
Asn Leu Arg Lys Asn Pro Asn Phe Asp He Cys Thr Leu Phe Met Glu 
1375 1380 1385 



gtg atg ctg ccc agg gag gag etc tac tgc ccc ccc ate acc gtc aag 
Val Met Leu Pro Arg Glu Glu Leu Tyr Cys Pro Pro He Thr Val Lys 
1390 " 1395 1400 



4585 



gtc ate gat aac cgc cag ttt ggc cgc egg cct gtg gtg ggc cag tgt 4633 
Val He Asp Asn Arg Gin Phe Gly Arg Arg Pro Val Val Gly Gin Cys 
1405 1410 1415 1420 

acc ate cgc tec ctg gag age ttc ctg tgt gac ccc tac teg gcg gag 4681 
Thr He Arg Ser Leu Glu Ser Phe Leu Cys Asp Pro Tyr Ser Ala Glu 
1425 1430 1435 

agt cca tec cca cag ggt ggc cca gac gat gtg age eta etc agt cct 4729 
Ser Pro Ser Pro Gin Gly Gly Pro Asp Asp Val Ser Leu Leu Ser Pro 
1440 ' 1445 1450 
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ggg gaa gac gtg etc ate gac att gat gac aag gag ccc etc ate ccc 
lly llu Asp Val Leu He Asp He Asp Asp Lys Glu Pro Leu He Pro 
* 14 55 1460 1465 

ate cao qaq gaa gag ttc ate gat tgg tgg age aaa ttc ttt gee tec 
He Gin Glu Glu Glu Phe He Asp Trp Trp Ser Lys Phe Phe Ala Ser 
1470 1475 1480 

ata ggg gag agg gaa aag tgc ggc tec tac ctg gag aag gat ttt gac 
He Gly Glu Arg Glu Lys Cys Gly Ser Tyr Leu Glu Lys Asp Phe Asp 
1485 1490 1495 1500 

ace ctg aag gtc tat gac aca cag ctg gag aat gtg gag gee ttt gag 
Thr Leu Lys Val Tyr Asp Thr Gin Leu Glu Asn Val Glu Ala Phe Glu 
1505 1510 1515 

age ctg tct gac ttt tgt aac acc ttc aag ctg tac egg ggc aag acg 
Gly Leu Ser Asp Phe Cys Asn Thr Phe Lys Leu Tyr Arg Gly Lys Thr 
15 20 1525 1530 

cag gag gag aca gaa gat cca tct gtg att ggt gaa ttt aag ggc etc 
Gin Glu Glu Thr Glu Asp Pro Ser Val He Gly Glu Phe Lys Gly Leu 
1535 1540 1545 

ttc aaa att tat ccc etc cca gaa gac cca gee ate ccc atg ccc cca 
Phe Lvs He Tyr Pro Leu Pro Glu Asp Pro Ala He Pro Met Pro Pro 
lt 50 1555 1560 

aga cag ttc cac cag ctg gee gee cag gga ccc cag gag tgc ttg gtc 
Arg Gin Phe His Gin Leu Ala Ala Gin Gly Pro Gin Glu Cys Leu Val 
15 65 1570 1575 1580 

cat ate tac att gtc cga gca ttt ggc ctg cag ccc aag gac ccc aat 
Ara He Tyr He Val Arg Ala Phe Gly Leu Gin Pro Lys Asp Pro Asn 
1585 1590 1595 

gga aag tgt gat cct tac ate aag ate tec ata ggg aag aaa tea gtg 
Gly Lys Cys Asp Pro Tyr He Lys He Ser He Gly Lys Lys Ser Val 
1600 1605 1610 

agt gac cag gat aac tac ate ccc tgc acg ctg gag ccc gta ttt gga 
Ser Asp Gin Asp Asn Tyr He Pro Cys Thr Leu Glu Pro Val Phe Gly 
1615 1620 1625 

aag atg ttc gag ctg acc tgc act ctg cct ctg gag aag gac eta aag 
Lvs Met Phe Glu Leu Thr Cys Thr Leu Pro Leu Glu Lys Asp Leu Lys 
1630 1635 1640 

ate act etc tat gac tat gac etc etc tec aag gac gaa aag ate ggt 
He Thr Leu Tyr Asp Tyr Asp Leu Leu Ser Lys Asp Glu Lys He Gly 
1645 " 1650 1655 1660 

gag acg gtc gtc gac ctg gag aac agg ctg ctg tec aag ttt ggg get 
Glu Thr Val Val Asp Leu Glu Asn Arg Leu Leu Ser Lys Phe Gly Ala 
1665 1670 1675 

tgt gga etc cca cag acc tac tgt gtc tct gga ccg aac cag tgg 
Arg Cys Gly Leu Pro Gin Thr Tyr Cys Val Ser Gly Pro Asn Gin Trp 
1680 1685 1690 

egg gac cag etc cgc ccc tec cag etc etc cac etc ttc tgc cag cag 
Ara Asp Gin Leu Arg Pro Ser Gin Leu Leu His Leu Phe Cys Gin Gin 
1695 1700 1705 



4777 



4825 



4873 



4921 



4969 



5017 



5065 



5113 



5161 



5209 



5257 



5305 



5353 



5401 



5449 



5497 
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5545 



cat aga gtc aag gca cct gtg tac egg aca gac cgt gta atg ttt cag 
His Arg Val Lys Ala Pro Val Tyr Arg Thr Asp Arg Val Met Phe Gin 
1710 1715 1720 

gat aaa gaa tat tec att gaa gag ata gag get ggc agg ate cca aac 
Asp Lys Glu Tyr Ser He Glu Glu He Glu Ala Gly Arg He Pro Asn 
1725 1730 1735 1740 

cca cac ctg ggc cca gtg gag gag cgt ctg get ctg cat gtg ctt cag 
Pro His Leu Gly Pro Val Glu Glu Arg Leu Ala Leu His Val Leu Gin 
1745 1750 1755 

cag cag ggc ctg gtc ccg gag cac gtg gag tea egg ccc etc tac age 
Gin Gin Gly Leu Val Pro Glu His Val Glu Ser Arg Pro Leu Tyr Ser 
1760 1765 1770 



5593 



5641 



5689 



ccc ctg cag cca gac ate gag cag ggg aag ctg cag atg tgg gtc gac 5737 
Pro Leu Gin Pro Asp He Glu Gin Gly Lys Leu Gin Met Trp Val Asp 
1775 1780 1785 

eta ttt ccg aag gee ctg ggg egg cct gga cct ccc ttc aac ate acc 5785 
Leu Phe Pro Lys Ala Leu Gly Arg Pro Gly Pro Pro Phe Asn He Thr 
1790 1795 1800 

cca egg aga gee aga agg ttt ttc ctg cgt tgt att ate tgg aat acc 5833 
Pro Arq Arg Ala Arg Arg Phe Phe Leu Arg Cys He He Trp Asn Thr 
1805 1810 1815 1820 

aga gat gtg ate ctg gat gac ctg age etc acg ggg gag aag atg age 
Ara Asp Val He Leu Asp Asp Leu Ser Leu Thr Gly Glu Lys Met Ser 
1825 1830 1835 

gac att tat gtg aaa ggt tgg atg att ggc ttt gaa gaa cac aag caa 
Asp He Tyr Val Lys Gly Trp Met He Gly Phe Glu Glu His Lys Gin 
1840 1845 1850 

aag aca gac gtg cat tat cgt tec ctg gga ggt gaa ggc aac ttc aac 
Lys Thr Asp Val His Tyr Arg Ser Leu Gly Gly Glu Gly Asn Phe Asn 
1855 I860 1865 

tgg agg ttc att ttc ccc ttc gac tac ctg cca get gag caa gtc tgt 
Trp Arg Phe He Phe Pro Phe Asp Tyr Leu Pro Ala Glu Gin Val Cys 
1870 1875 1880 

acc att gec aag aag gat gee ttc tgg agg ctg gac aag act gag age 
Thr He Ala Lys Lys Asp Ala Phe Trp Arg Leu Asp Lys Thr Glu Ser 
1885 1890 1895 1900 

aaa ate cca gca cga gtg gtg ttc cag ate tgg gac aat gac aag ttc 
Lvs He Pro Ala Arg Val Val Phe Gin He Trp Asp Asn Asp Lys Phe 
y 1905 1910 1915 

tec ttt gat gat ttt ctg ggc tec ctg cag etc gat etc aac cgc atg 
Ser Phe Asp Asp Phe Leu Gly Ser Leu Gin Leu Asp Leu Asn Arg Met 
1920 1925 1930 

ccc aag cca gee aag aca gee aag aag tgc tec ttg gac cag ctg gat 6217 
Pro Lys Pro Ala Lys Thr Ala Lys Lys Cys Ser Leu Asp Gin Leu Asp 
1935 1940 1945 

gat get ttc cac cca gaa tgg ttt gtg tec ctt ttt gag cag aaa aca 
Asp Ala Phe His Pro Glu Trp Phe Val Ser Leu Phe Glu Gin Lys Thr 
1950 1955 I960 

crtq aag ggc tgg tgg ccc tgt gta gca gaa gag ggt gag aag aaa ata 6313 
Val Lvs Glv Trp Trp Pro Cys Val Ala Glu Glu Gly Glu Lys Lys He 
1965 1970 1975 1980 



5881 



5929 



5977 



602 5 



6073 



6121 



6169 



6265 
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ctq gcg ggc aag ctg gaa atg acc ttg gag att gta gca gag agt gag 6361 
Leu Ala Gly Lys Leu Glu Met Thr Leu Glu lie Val Ala Glu Ser Glu 
1985 1990 1995 



cat gag gag egg cct get ggc cag ggc egg gat gag ccc aac atg aac 6409 
His Glu Glu Arg Pro Ala Gly Gin Gly Arg Asp Glu Pro Asn Met Asn 
2000 2005 2010 

cct aag ctt gag gac cca agg cgc ccc gac acc tec ttc ctg tgg ttt 6457 
Pro Lys Leu Glu Asp Pro Arg Arg Pro Asp Thr Ser Phe Leu Trp Phe 
2015 2020 2025 

acc tcc cca tac aag acc atg aag ttc atc ctg tgg cgg cgt ttc cgg 6505
Thr Ser Pro Tyr Lys Thr Met Lys Phe Ile Leu Trp Arg Arg Phe Arg
2030 2035 2040

tgg gcc atc atc ctc ttc atc atc ctc ttc atc ctg ctg ctg ttc ctg 6553
Trp Ala Ile Ile Leu Phe Ile Ile Leu Phe Ile Leu Leu Leu Phe Leu
2045 2050 2055 2060

gcc atc ttc atc tac gcc ttc ccg aac tat gct gcc atg aag ctg gtg 6601
Ala Ile Phe Ile Tyr Ala Phe Pro Asn Tyr Ala Ala Met Lys Leu Val
2065 2070 2075

aag ccc ttc agc tgaggactct cctgccctgt agaaggggcc gtggggtccc 6653
Lys Pro Phe Ser
2080

ctccagcatg ggactggcct gcctcctccg cccagctcgg cgagctcctc cagacctcct 6713
aggcctgatt gtcctgccag ggtgggcaga cagacagatg gaccggccca cactcccaga 6773
gttgctaaca tggagctctg agatcacccc acttccatca tttccttctc ccccaaccca 6833
acgctttttt ggatcagctc agacatattt cagtataaaa cagttggaac cacaaaaaaa 6893
aaaaaaaaaa aaaaaaaa 6911

Ser Arg Lys Leu Leu Ser Asp Lys Pro Gln Asp Phe Gln Ile Arg Val
210 215 220

Gln Val Ile Glu Gly Arg Gln Leu Pro Gly Val Asn Ile Lys Pro Val
225 230 235 240

Val Lys Val Thr Ala Ala Gly Gln Thr Lys Arg Thr Arg Ile His Lys
245 250 255

Gly Asn Ser Pro Leu Phe Asn Glu Thr Leu Phe Phe Asn Leu Phe Asp
260 265 270

Ser Pro Gly Glu Leu Phe Asp Glu Pro Ile Phe Ile Thr Val Val Asp
275 280 285

Ser Arg Ser Leu Arg Thr Asp Ala Leu Leu Gly Glu Phe Arg Met Asp
290 295 300

Val Gly Thr Ile Tyr Arg Glu Pro Arg His Ala Tyr Leu Arg Lys Trp
305 310 315 320

Leu Leu Leu Ser Asp Pro Asp Asp Phe Ser Ala Gly Ala Arg Gly Tyr
325 330 335

Leu Lys Thr Ser Leu Cys Val Leu Gly Pro Gly Asp Glu Ala Pro Leu
340 345 350

Glu Arg Lys Asp Pro Ser Glu Asp Lys Glu Asp Ile Glu Ser Asn Leu
355 360 365

Leu Arg Pro Thr Gly Val Ala Leu Arg Gly Ala His Phe Cys Leu Lys
370 375 380

Val Phe Arg Ala Glu Asp Leu Pro Gln Met Asp Asp Ala Val Met Asp
385 390 395 400

Asn Val Lys Gln Ile Phe Gly Phe Glu Ser Asn Lys Lys Asn Leu Val
405 410 415

Asp Pro Phe Val Glu Val Ser Phe Ala Gly Lys Met Leu Cys Ser Lys
420 425 430

Ile Leu Glu Lys Thr Ala Asn Pro Gln Trp Asn Gln Asn Ile Thr Leu
435 440 445

Pro Ala Met Phe Pro Ser Met Cys Glu Lys Met Arg Ile Arg Ile Ile
450 455 460

Asp Trp Asp Arg Leu Thr His Asn Asp Ile Val Ala Thr Thr Tyr Leu
465 470 475 480

Ser Met Ser Lys Ile Ser Ala Pro Gly Gly Glu Ile Glu Glu Glu Pro
485 490 495

Ala Gly Ala Val Lys Pro Ser Lys Ala Ser Asp Leu Asp Asp Tyr Leu
500 505 510

Gly Phe Leu Pro Thr Phe Gly Pro Cys Tyr Ile Asn Leu Tyr Gly Ser
515 520 525

Pro Arg Glu Phe Thr Gly Phe Pro Asp Pro Tyr Thr Glu Leu Asn Thr
530 535 540

Gly Lys Gly Glu Gly Val Ala Tyr Arg Gly Arg Leu Leu Leu Ser Leu
545 550 555 560

Glu Thr Lys Leu Val Glu His Ser Glu Gln Lys Val Glu Asp Leu Pro
565 570 575

Ala Asp Asp Ile Leu Arg Val Glu Lys Tyr Leu Arg Arg Arg Lys Tyr
580 585 590

Ser Leu Phe Ala Ala Phe Tyr Ser Ala Thr Met Leu Gln Asp Val Asp
595 600 605

Asp Ala Ile Gln Phe Glu Val Ser Ile Gly Asn Tyr Gly Asn Lys Phe
610 615 620

Asp Met Thr Cys Leu Pro Leu Ala Ser Thr Thr Gln Tyr Ser Arg Ala
625 630 635 640

Val Phe Asp Gly Cys His Tyr Tyr Tyr Leu Pro Trp Gly Asn Val Lys
645 650 655

Pro Val Val Val Leu Ser Ser Tyr Trp Glu Asp Ile Ser His Arg Ile
660 665 670

Glu Thr Gln Asn Gln Leu Leu Gly Ile Ala Asp Arg Leu Glu Ala Gly
675 680 685

Leu Glu Gln Val His Leu Ala Leu Lys Ala Gln Cys Ser Thr Glu Asp
690 695 700

Val Asp Ser Leu Val Ala Gln Leu Thr Asp Glu Leu Ile Ala Gly Cys
705 710 715 720

Ser Gln Pro Leu Gly Asp Ile His Glu Thr Pro Ser Ala Thr His Leu
725 730 735

Asp Gln Tyr Leu Tyr Gln Leu Arg Thr His His Leu Ser Gln Ile Thr
740 745 750

Glu Ala Ala Leu Ala Leu Lys Leu Gly His Ser Glu Leu Pro Ala Ala
755 760 765

Leu Glu Gln Ala Glu Asp Trp Leu Leu Arg Leu Arg Ala Leu Ala Glu
770 775 780

Glu Pro Gln Asn Ser Leu Pro Asp Ile Val Ile Trp Met Leu Gln Gly
785 790 795 800

Asp Lys Arg Val Ala Tyr Gln Arg Val Pro Ala His Gln Val Leu Phe
805 810 815

Ser Arg Arg Gly Ala Asn Tyr Cys Gly Lys Asn Cys Gly Lys Leu Gln
820 825 830

Thr Ile Phe Leu Lys Tyr Pro Met Glu Lys Val Pro Gly Ala Arg Met
835 840 845

Pro Val Gln Ile Arg Val Lys Leu Trp Phe Gly Leu Ser Val Asp Glu
850 855 860

Lys Glu Phe Asn Gln Phe Ala Glu Gly Lys Leu Ser Val Phe Ala Glu
865 870 875 880

Thr Tyr Glu Asn Glu Thr Lys Leu Ala Leu Val Gly Asn Trp Gly Thr
885 890 895

Thr Gly Leu Thr Tyr Pro Lys Phe Ser Asp Val Thr Gly Lys Ile Lys
900 905 910

Leu Pro Lys Asp Ser Phe Arg Pro Ser Ala Gly Trp Thr Trp Ala Gly
915 920 925

Asp Trp Phe Val Cys Pro Glu Lys Thr Leu Leu His Asp Met Asp Ala
930 935 940

Gly His Leu Ser Phe Val Glu Glu Val Phe Glu Asn Gln Thr Arg Leu
945 950 955 960

Pro Gly Gly Gln Trp Ile Tyr Met Ser Asp Asn Tyr Thr Asp Val Asn
965 970 975

Gly Glu Lys Val Leu Pro Lys Asp Asp Ile Glu Cys Pro Leu Gly Trp
980 985 990

Lys Trp Glu Asp Glu Glu Trp Ser Thr Asp Leu Asn Arg Ala Val Asp
995 1000 1005

Glu Gln Gly Trp Glu Tyr Ser Ile Thr Ile Pro Pro Glu Arg Lys Pro
1010 1015 1020

Lys His Trp Val Pro Ala Glu Lys Met Tyr Tyr Thr His Arg Arg Arg
1025 1030 1035 1040

Arg Trp Val Arg Leu Arg Arg Arg Asp Leu Ser Gln Met Glu Ala Leu
1045 1050 1055

Lys Arg His Arg Gln Ala Glu Ala Glu Gly Glu Gly Trp Glu Tyr Ala
1060 1065 1070

Ser Leu Phe Gly Trp Lys Phe His Leu Glu Tyr Arg Lys Thr Asp Ala
1075 1080 1085

Phe Arg Arg Arg Arg Trp Arg Arg Arg Met Glu Pro Leu Glu Lys Thr
1090 1095 1100

Gly Pro Ala Ala Val Phe Ala Leu Glu Gly Ala Leu Gly Gly Val Met
1105 1110 1115 1120

Asp Asp Lys Ser Glu Asp Ser Met Ser Val Ser Thr Leu Ser Phe Gly
1125 1130 1135

Val Asn Arg Pro Thr Ile Ser Cys Ile Phe Asp Tyr Gly Asn Arg Tyr
1140 1145 1150

His Leu Arg Cys Tyr Met Tyr Gln Ala Arg Asp Leu Ala Ala Met Asp
1155 1160 1165

Lys Asp Ser Phe Ser Asp Pro Tyr Ala Ile Val Ser Phe Leu His Gln
1170 1175 1180

Ser Gln Lys Thr Val Val Val Lys Asn Thr Leu Asn Pro Thr Trp Asp
1185 1190 1195 1200

Gln Thr Leu Ile Phe Tyr Glu Ile Glu Ile Phe Gly Glu Pro Ala Thr
1205 1210 1215

Val Ala Glu Gln Pro Pro Ser Ile Val Val Glu Leu Tyr Asp His Asp
1220 1225 1230

Thr Tyr Gly Ala Asp Glu Phe Met Gly Arg Cys Ile Cys Gln Pro Ser
1235 1240 1245

Leu Glu Arg Met Pro Arg Leu Ala Trp Phe Pro Leu Thr Arg Gly Ser
1250 1255 1260

Gln Pro Ser Gly Glu Leu Leu Ala Ser Phe Glu Leu Ile Gln Arg Glu
1265 1270 1275 1280

Lys Pro Ala Ile His His Ile Pro Gly Phe Glu Val Gln Glu Thr Ser
1285 1290 1295

Arg Ile Leu Asp Glu Ser Glu Asp Thr Asp Leu Pro Tyr Pro Pro Pro
1300 1305 1310

Gln Arg Glu Ala Asn Ile Tyr Met Val Pro Gln Asn Ile Lys Pro Ala
1315 1320 1325

Leu Gln Arg Thr Ala Ile Glu Ile Leu Ala Trp Gly Leu Arg Asn Met
1330 1335 1340

Lys Ser Tyr Gln Leu Ala Asn Ile Ser Ser Pro Ser Leu Val Val Glu
1345 1350 1355 1360

Cys Gly Gly Gln Thr Val Gln Ser Cys Val Ile Arg Asn Leu Arg Lys
1365 1370 1375

Asn Pro Asn Phe Asp Ile Cys Thr Leu Phe Met Glu Val Met Leu Pro
1380 1385 1390

Arg Glu Glu Leu Tyr Cys Pro Pro Ile Thr Val Lys Val Ile Asp Asn
1395 1400 1405

Arg Gln Phe Gly Arg Arg Pro Val Val Gly Gln Cys Thr Ile Arg Ser
1410 1415 1420

Leu Glu Ser Phe Leu Cys Asp Pro Tyr Ser Ala Glu Ser Pro Ser Pro
1425 1430 1435 1440

Gln Gly Gly Pro Asp Asp Val Ser Leu Leu Ser Pro Gly Glu Asp Val
1445 1450 1455

Leu Ile Asp Ile Asp Asp Lys Glu Pro Leu Ile Pro Ile Gln Glu Glu
1460 1465 1470

Glu Phe Ile Asp Trp Trp Ser Lys Phe Phe Ala Ser Ile Gly Glu Arg
1475 1480 1485

Glu Lys Cys Gly Ser Tyr Leu Glu Lys Asp Phe Asp Thr Leu Lys Val
1490 1495 1500

Tyr Asp Thr Gln Leu Glu Asn Val Glu Ala Phe Glu Gly Leu Ser Asp
1505 1510 1515 1520

Phe Cys Asn Thr Phe Lys Leu Tyr Arg Gly Lys Thr Gln Glu Glu Thr
1525 1530 1535

Glu Asp Pro Ser Val Ile Gly Glu Phe Lys Gly Leu Phe Lys Ile Tyr
1540 1545 1550

Pro Leu Pro Glu Asp Pro Ala Ile Pro Met Pro Pro Arg Gln Phe His
1555 1560 1565

Gln Leu Ala Ala Gln Gly Pro Gln Glu Cys Leu Val Arg Ile Tyr Ile
1570 1575 1580

Val Arg Ala Phe Gly Leu Gln Pro Lys Asp Pro Asn Gly Lys Cys Asp
1585 1590 1595 1600

Pro Tyr Ile Lys Ile Ser Ile Gly Lys Lys Ser Val Ser Asp Gln Asp
1605 1610 1615

Asn Tyr Ile Pro Cys Thr Leu Glu Pro Val Phe Gly Lys Met Phe Glu
1620 1625 1630

Leu Thr Cys Thr Leu Pro Leu Glu Lys Asp Leu Lys Ile Thr Leu Tyr
1635 1640 1645

Asp Tyr Asp Leu Leu Ser Lys Asp Glu Lys Ile Gly Glu Thr Val Val
1650 1655 1660

Asp Leu Glu Asn Arg Leu Leu Ser Lys Phe Gly Ala Arg Cys Gly Leu
1665 1670 1675 1680

Pro Gln Thr Tyr Cys Val Ser Gly Pro Asn Gln Trp Arg Asp Gln Leu
1685 1690 1695

Arg Pro Ser Gln Leu Leu His Leu Phe Cys Gln Gln His Arg Val Lys
1700 1705 1710

Ala Pro Val Tyr Arg Thr Asp Arg Val Met Phe Gln Asp Lys Glu Tyr
1715 1720 1725

Ser Ile Glu Glu Ile Glu Ala Gly Arg Ile Pro Asn Pro His Leu Gly
1730 1735 1740

Pro Val Glu Glu Arg Leu Ala Leu His Val Leu Gln Gln Gln Gly Leu
1745 1750 1755 1760

Val Pro Glu His Val Glu Ser Arg Pro Leu Tyr Ser Pro Leu Gln Pro
1765 1770 1775

Asp Ile Glu Gln Gly Lys Leu Gln Met Trp Val Asp Leu Phe Pro Lys
1780 1785 1790

Ala Leu Gly Arg Pro Gly Pro Pro Phe Asn Ile Thr Pro Arg Arg Ala
1795 1800 1805

Arg Arg Phe Phe Leu Arg Cys Ile Ile Trp Asn Thr Arg Asp Val Ile
1810 1815 1820

Leu Asp Asp Leu Ser Leu Thr Gly Glu Lys Met Ser Asp Ile Tyr Val
1825 1830 1835 1840

Lys Gly Trp Met Ile Gly Phe Glu Glu His Lys Gln Lys Thr Asp Val
1845 1850 1855

His Tyr Arg Ser Leu Gly Gly Glu Gly Asn Phe Asn Trp Arg Phe Ile
1860 1865 1870

Phe Pro Phe Asp Tyr Leu Pro Ala Glu Gln Val Cys Thr Ile Ala Lys
1875 1880 1885

Lys Asp Ala Phe Trp Arg Leu Asp Lys Thr Glu Ser Lys Ile Pro Ala
1890 1895 1900

Arg Val Val Phe Gln Ile Trp Asp Asn Asp Lys Phe Ser Phe Asp Asp
1905 1910 1915 1920

Phe Leu Gly Ser Leu Gln Leu Asp Leu Asn Arg Met Pro Lys Pro Ala
1925 1930 1935

Lys Thr Ala Lys Lys Cys Ser Leu Asp Gln Leu Asp Asp Ala Phe His
1940 1945 1950

Pro Glu Trp Phe Val Ser Leu Phe Glu Gln Lys Thr Val Lys Gly Trp
1955 1960 1965

Trp Pro Cys Val Ala Glu Glu Gly Glu Lys Lys Ile Leu Ala Gly Lys
1970 1975 1980

Leu Glu Met Thr Leu Glu Ile Val Ala Glu Ser Glu His Glu Glu Arg
1985 1990 1995 2000

Pro Ala Gly Gln Gly Arg Asp Glu Pro Asn Met Asn Pro Lys Leu Glu
2005 2010 2015

Asp Pro Arg Arg Pro Asp Thr Ser Phe Leu Trp Phe Thr Ser Pro Tyr
2020 2025 2030

Lys Thr Met Lys Phe Ile Leu Trp Arg Arg Phe Arg Trp Ala Ile Ile
2035 2040 2045

Leu Phe Ile Ile Leu Phe Ile Leu Leu Leu Phe Leu Ala Ile Phe Ile
2050 2055 2060

Tyr Ala Phe Pro Asn Tyr Ala Ala Met Lys Leu Val Lys Pro Phe Ser
2065 2070 2075 2080

<210> 3 
<211> 5915 
<212> DNA 

<213> Homo sapiens 



<400> 3 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780

tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440

cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500

gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560 

tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620

ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680 

gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740 

cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800 

taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860 

tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920 

cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980 

agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040 

tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100 

tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160 

ggccttctac tcagccacca tgctgcagga tgtggatgat gccatccagt ttgaggtcag 2220 

catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280 

gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340

acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400 

ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460 

gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520 

catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580 

ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640 

ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700 

cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760 

gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820 

ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880 

gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940 

gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000 

tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060 

aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120 

cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180 

gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240 

ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300 

cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360 

tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420 

caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480 

acaccgacgg cggcgctggg tgcgcctgcg caggagggat ctcagccaaa tggaagcact 3540 

gaaaaggcac aggcaggcgg aggcggaggg cgagggctgg gagtacgcct ctctttttgg 3600 

ctggaagttc cacctcgagt accgcaagac agatgccttc cgccgccgcc gctggcgccg 3660 

tcgcatggag ccactggaga agacggggcc tgcagctgtg tttgcccttg agggggccct 3720 

gggcggcgtg atggatgaca agagtgaaga ttccatgtcc gtctccacct tgagcttcgg 3780 

tgtgaacaga cccacgattt cctgcatatt cgactatggg aaccgctacc atctacgctg 3840 

ctacatgtac caggcccggg acctggctgc gatggacaag gactcttttt ctgatcccta 3900

tgccatcgtc tccttcctgc accagagcca gaagacggtg gtggtgaaga acacccttaa 3960 

ccccacctgg gaccagacgc tcatcttcta cgagatcgag atctttggcg agccggccac 4020 

agttgctgag caaccgccca gcattgtggt ggagctgtac gaccatgaca cttatggtgc 4080 

agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140 

ctggttccca ctgacgaggg gcagccagcc gtcgggggag ctgctggcct cttttgagct 4200 

catccagaga gagaagccgg ccatccacca tattcctggt tttgaggtgc aggagacatc 4260 

aaggatcctg gatgagtctg aggacacaga cctgccctac ccaccacccc agagggaggc 4320 

caacatctac atggttcctc agaacatcaa gccagcgctc cagcgtaccg ccatcgagat 4380 

cctggcatgg ggcctgcgga acatgaagag ttaccagctg gccaacatct cctcccccag 4440 

cctcgtggta gagtgtgggg gccagacggt gcagtcctgt gtcatcagga acctccggaa 4500 

gaaccccaac tttgacatct gcaccctctt catggaagtg atgctgccca gggaggagct 4560 

ctactgcccc cccatcaccg tcaaggtcat cgataaccgc cagtttggcc gccggcctgt 4620 

ggtgggccag tgtaccatcc gctccctgga gagcttcctg tgtgacccct actcggcgga 4680 

gagtccatcc ccacagggtg gcccagacga tgtgagccta ctcagtcctg gggaagacgt 4740 

gctcatcgac attgatgaca aggagcccct catccccatc caggaggaag agttcatcga 4800 

ttggtggagc aaattctttg cctccatagg ggagagggaa aagtgcggct cctacctgga 4860 

gaaggatttt gacaccctga aggtctatga cacacagctg gagaatgtgg aggcctttga 4920 

gggcctgtct gacttttgta acaccttcaa gctgtaccgg ggcaagacgc aggaggagac 4980 

agaagatcca tctgtgattg gtgaatttaa gggcctcttc aaaatttatc ccctcccaga 5040 

agacccagcc atccccatgc ccccaagaca gttccaccag ctggccgccc agggacccca 5100 

ggagtgcttg gtccgtatct acattgtccg agcatttggc ctgcagccca aggaccccaa 5160 

tggaaagtgt gatccttaca tcaagatctc catagggaag aaatcagtga gtgaccagga 5220 

taactacatc ccctgcacgc tggagcccgt atttggaaag atgttcgagc tgacctgcac 5280 

tctgcctctg gagaaggacc taaagatcac tctctatgac tatgacctcc tctccaagga 5340 

cgaaaagatc ggtgagacgg tcgtcgacct ggagaacagg ctgctgtcca agtttggggc 5400 

tcgctgtgga ctcccacaga cctactgtgt ctctggaccg aaccagtggc gggaccagct 5460

ccgcccctcc cagctcctcc acctcttctg ccagcagcat agagtcaagg cacctgtgta 5520

ccggacagac cgtgtaatgt ttcaggataa agaatattcc attgaagaga tagaggctgg 5580

caggatccca aacccacacc tgggcccagt ggaggagcgt ctggctctgc atgtgcttca 5640

gcagcagggc ctggtcccgg agcacgtgga gtcacggccc ctctacagcc ccctgcagcc 5700

agacatcgag caggggaagc tgcagatgtg ggtcgaccta tttccgaagg ccctggggcg 5760

gcctggacct cccttcaaca tcaccccacg gagagccaga aggtttttcc tgcgttgtat 5820

tatctggaat accagagatg tgatcctgga tgacctgagc ctcacggggg agaagatgag 5880

cgacatttat gtgaaaggtt ggatgattgg ctttg 5915



<210> 4 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 4 
tgggacctca aagggcatcc 20

<210> 5
<211> 20
<212> DNA
<213> Homo sapiens

<400> 5
accatgctgt aggatgtgga 20

<210> 6
<211> 20
<212> DNA
<213> Homo sapiens

<400> 6
gggaggtgaa gcaacttcaa 20

<210> 7
<211> 20
<212> DNA
<213> Homo sapiens

<400> 7
ctcacggggt agaagatgag 20

<210> 8
<211> 20
<212> DNA
<213> Homo sapiens

<400> 8
cagggccgag atgagcccaa 20

<210> 9
<211> 20
<212> DNA
<213> Homo sapiens

<400> 9
acatcaaggg tcctggatga 20

<210> 10
<211> 20
<212> DNA
<213> Homo sapiens

<400> 10
ctgtggcggt gtttccggtg 20

<210> 11
<211> 20
<212> DNA

<213> Homo sapiens 

<400> 11 
acagacgtgc gttatcgttc 

<210> 12 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 12 
aagactgagc aaaatcccag 

<210> 13 

<211> 6912 

<212> DNA 

<213> Homo sapiens 



<400> 13 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaagg 540

gcatccccct ggaccagggc tctgagcttc atgtggtggt caaagaccat gagacgatgg 600

ggaggaacag gttcctgggg gaagccaagg tcccactccg agaggtcctc gccaccccta 660

gtctgtccgc cagcttcaat gcccccctgc tggacaccaa gaagcagccc acaggggcct 720

cgctggtcct gcaggtgtcc tacacaccgc tgcctggagc tgtgcccctg ttcccgcccc 780

ctactcctct ggagccctcc ccgactctgc ctgacctgga tgtagtggca gacacaggag 840

gagaggaaga cacagaggac cagggactca ctggagatga ggcggagcca ttcctggatc 900

aaagcggagg cccgggggct cccaccaccc caaggaaact accttcacgt cctccgcccc 960

actaccccgg gatcaaaaga aagcgaagtg cgcctacatc tagaaagctg ctgtcagaca 1020

aaccgcagga tttccagatc agggtccagg tgatcgaggg gcgccagctg ccgggggtga 1080

acatcaagcc tgtggtcaag gttaccgctg cagggcagac caagcggacg cggatccaca 1140

agggaaacag cccactcttc aatgagactc ttttcttcaa cttgtttgac tctcctgggg 1200

agctgtttga tgagcccatc tttatcacgg tggtagactc tcgttctctc aggacagatg 1260

ctctcctcgg ggagttccgg atggacgtgg gcaccattta cagagagccc cggcacgcct 1320

atctcaggaa gtggctgctg ctctcagacc ctgatgactt ctctgctggg gccagaggct 1380

acctgaaaac aagcctttgt gtgctggggc ctggggacga agcgcctctg gagagaaaag 1440

acccctctga agacaaggag gacattgaaa gcaacctgct ccggcccaca ggcgtagccc 1500

tgcgaggagc ccacttctgc ctgaaggtct tccgggccga ggacttgccg cagatggacg 1560

atgccgtgat ggacaacgtg aaacagatct ttggcttcga gagtaacaag aagaacttgg 1620

tggacccctt tgtggaggtc agctttgcgg ggaaaatgct gtgcagcaag atcttggaga 1680

agacggccaa ccctcagtgg aaccagaaca tcacactgcc tgccatgttt ccctccatgt 1740

gcgaaaaaat gaggattcgt atcatagact gggaccgcct gactcacaat gacatcgtgg 1800

ctaccaccta cctgagtatg tcgaaaatct ctgcccctgg aggagaaata gaagaggagc 1860

ctgcaggtgc tgtcaagcct tcgaaagcct cagacttgga tgactacctg ggcttcctcc 1920

ccacttttgg gccctgctac atcaacctct atggcagtcc cagagagttc acaggcttcc 1980

cagaccccta cacagagctc aacacaggca agggggaagg tgtggcttat cgtggccggc 2040

ttctgctctc cctggagacc aagctggtgg agcacagtga acagaaggtg gaggaccttc 2100

ctgcggatga catcctccgg gtggagaagt accttaggag gcgcaagtac tccctgtttg 2160

cggccttcta ctcagccacc atgctgtagg atgtggatga tgccatccag tttgaggtca 2220

gcatcgggaa ctacgggaac aagttcgaca tgacctgcct gccgctggcc tccaccactc 2280

agtacagccg tgcagtcttt gacgggtgcc actactacta cctaccctgg ggtaacgtga 2340

aacctgtggt ggtgctgtca tcctactggg aggacatcag ccatagaatc gagactcaga 2400

accagctgct tgggattgct gaccggctgg aagctggcct ggagcaggtc cacctggccc 2460

tgaaggcgca gtgctccacg gaggacgtgg actcgctggt ggctcagctg acggatgagc 2520

tcatcgcagg ctgcagccag cctctgggtg acatccatga gacaccctct gccacccacc 2580

tggaccagta cctgtaccag ctgcgcaccc atcacctgag ccaaatcact gaggctgccc 2640

tggccctgaa gctcggccac agtgagctcc ctgcagctct ggagcaggcg gaggactggc 2700

tcctgcgtct gcgtgccctg gcagaggagc cccagaacag cctgccggac atcgtcatct 2760

ggatgctgca gggagacaag cgtgtggcat accagcgggt gcccgcccac caagtcctct 2820

tctcccggcg gggtgccaac tactgtggca agaattgtgg gaagctacag acaatctttc 2880

tgaaatatcc gatggagaag gtgcctggcg cccggatgcc agtgcagata cgggtcaagc 2940

tgtggtttgg gctctctgtg gatgagaagg agttcaacca gtttgctgag gggaagctgt 3000

ctgtctttgc tgaaacctat gagaacgaga ctaagttggc ccttgttggg aactggggca 3060

caacgggcct cacctacccc aagttttctg acgtcacggg caagatcaag ctacccaagg 3120

acagcttccg cccctcggcc ggctggacct gggctggaga ttggttcgtg tgtccggaga 3180

agactctgct ccatgacatg gacgccggtc acctgagctt cgtggaagag gtgtttgaga 3240

accagacccg gcttcccgga ggccagtgga tctacatgag tgacaactac accgatgtga 3300

acggggagaa ggtgcttccc aaggatgaca ttgagtgccc actgggctgg aagtgggaag 3360

atgaggaatg gtccacagac ctcaaccggg ctgtcgatga gcaaggctgg gagtatagca 3420

tcaccatccc cccggagcgg aagccgaagc actgggtccc tgctgagaag atgtactaca 3480

cacaccgacg gcggcgctgg gtgcgcctgc gcaggaggga tctcagccaa atggaagcac 3540

tgaaaaggca caggcaggcg gaggcggagg gcgagggctg ggagtacgcc tctctttttg 3600

gctggaagtt ccacctcgag taccgcaaga cagatgcctt ccgccgccgc cgctggcgcc 3660

gtcgcatgga gccactggag aagacggggc ctgcagctgt gtttgccctt gagggggccc 3720

tgggcggcgt gatggatgac aagagtgaag attccatgtc cgtctccacc ttgagcttcg 3780

gtgtgaacag acccacgatt tcctgcatat tcgactatgg gaaccgctac catctacgct 3840

gctacatgta ccaggcccgg gacctggctg cgatggacaa ggactctttt tctgatccct 3900

atgccatcgt ctccttcctg caccagagcc agaagacggt ggtggtgaag aacaccctta 3960

accccacctg ggaccagacg ctcatcttct acgagatcga gatctttggc gagccggcca 4020

cagttgctga gcaaccgccc agcattgtgg tggagctgta cgaccatgac acttatggtg 4080

cagacgagtt tatgggtcgc tgcatctgtc aaccgagtct ggaacggatg ccacggctgg 4140

cctggttccc actgacgagg ggcagccagc cgtcggggga gctgctggcc tcttttgagc 4200

tcatccagag agagaagccg gccatccacc atattcctgg ttttgaggtg caggagacat 4260

caaggatcct ggatgagtct gaggacacag acctgcccta cccaccaccc cagagggagg 4320

ccaacatcta catggttcct cagaacatca agccagcgct ccagcgtacc gccatcgaga 4380

tcctggcatg gggcctgcgg aacatgaaga gttaccagct ggccaacatc tcctccccca 4440

gcctcgtggt agagtgtggg ggccagacgg tgcagtcctg tgtcatcagg aacctccgga 4500

agaaccccaa ctttgacatc tgcaccctct tcatggaagt gatgctgccc agggaggagc 4560

tctactgccc ccccatcacc gtcaaggtca tcgataaccg ccagtttggc cgccggcctg 4620

tggtgggcca gtgtaccatc cgctccctgg agagcttcct gtgtgacccc tactcggcgg 4680

agagtccatc cccacagggt ggcccagacg atgtgagcct actcagtcct ggggaagacg 4740

tgctcatcga cattgatgac aaggagcccc tcatccccat ccaggaggaa gagttcatcg 4800

attggtggag caaattcttt gcctccatag gggagaggga aaagtgcggc tcctacctgg 4860

agaaggattt tgacaccctg aaggtctatg acacacagct ggagaatgtg gaggcctttg 4920

agggcctgtc tgacttttgt aacaccttca agctgtaccg gggcaagacg caggaggaga 4980

cagaagatcc atctgtgatt ggtgaattta agggcctctt caaaatttat cccctcccag 5040

aagacccagc catccccatg cccccaagac agttccacca gctggccgcc cagggacccc 5100

aggagtgctt ggtccgtatc tacattgtcc gagcatttgg cctgcagccc aaggacccca 5160

atggaaagtg tgatccttac atcaagatct ccatagggaa gaaatcagtg agtgaccagg 5220

ataactacat cccctgcacg ctggagcccg tatttggaaa gatgttcgag ctgacctgca 5280

ctctgcctct ggagaaggac ctaaagatca ctctctatga ctatgacctc ctctccaagg 5340

acgaaaagat cggtgagacg gtcgtcgacc tggagaacag gctgctgtcc aagtttgggg 5400

ctcgctgtgg actcccacag acctactgtg tctctggacc gaaccagtgg cgggaccagc 5460

tccgcccctc ccagctcctc cacctcttct gccagcagca tagagtcaag gcacctgtgt 5520

accggacaga ccgtgtaatg tttcaggata aagaatattc cattgaagag atagaggctg 5580

gcaggatccc aaacccacac ctgggcccag tggaggagcg tctggctctg catgtgcttc 5640

agcagcaggg cctggtcccg gagcacgtgg agtcacggcc cctctacagc cccctgcagc 5700

cagacatcga gcaggggaag ctgcagatgt gggtcgacct atttccgaag gccctggggc 5760

ggcctggacc tcccttcaac atcaccccac ggagagccag aaggtttttc ctgcgttgta 5820

ttatctggaa taccagagat gtgatcctgg atgacctgag cctcacgggg gagaagatga 5880

gcgacattta tgtgaaaggt tggatgattg gctttgaaga acacaagcaa aagacagacg 5940

tgcattatcg ttccctggga ggtgaaggca acttcaactg gaggttcatt ttccccttcg 6000

actacctgcc agctgagcaa gtctgtacca ttgccaagaa ggatgccttc tggaggctgg 6060

acaagactga gagcaaaatc ccagcacgag tggtgttcca gatctgggac aatgacaagt 6120

tctcctttga tgattttctg ggctccctgc agctcgatct caaccgcatg cccaagccag 6180

ccaagacagc caagaagtgc tccttggacc agctggatga tgctttccac ccagaatggt 6240

ttgtgtccct ttttgagcag aaaacagtga agggctggtg gccctgtgta gcagaagagg 6300

gtgagaagaa aatactggcg ggcaagctgg aaatgacctt ggagattgta gcagagagtg 6360

agcatgagga gcggcctgct ggccagggcc gggatgagcc caacatgaac cctaagcttg 6420

aggacccaag gcgccccgac acctccttcc tgtggtttac ctccccatac aagaccatga 6480

agttcatcct gtggcggcgt ttccggtggg ccatcatcct cttcatcatc ctcttcatcc 6540

tgctgctgtt cctggccatc ttcatctacg ccttcccgaa ctatgctgcc atgaagctgg 6600

tgaagccctt cagctgagga ctctcctgcc ctgtagaagg ggccgtgggg tcccctccag 6660

catgggactg gcctgcctcc tccgcccagc tcggcgagct cctccagacc tcctaggcct 6720

gattgtcctg ccagggtggg cagacagaca gatggaccgg cccacactcc cagagttgct 6780

aacatggagc tctgagatca ccccacttcc atcatttcct tctcccccaa cccaacgctt 6840

ttttggatca gctcagacat atttcagtat aaaacagttg gaaccacaaa aaaaaaaaaa 6900

aaaaaaaaaa aa 6912



<210> 14 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 14 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480 

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540 

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600 

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660 

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720 

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780 

tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840 

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900 

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960 

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020 

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080 

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140 

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200 

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260 

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380 

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440 

cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500 

gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560 

tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620 

ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680 

gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740 

cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800 

taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860 

tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920 

cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980 

agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040 

tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100 

tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160 

ggccttctac tcagccacca tgctgtagga tgtggatgat gccatccagt ttgaggtcag 2220 

catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280 

gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340 

acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400 

ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460 

gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520 

catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580 

ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640 

ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700 

cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760 

gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820 

ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880 

gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940 

gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000 

tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060 

aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120 

cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180 

gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240 

ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300 

cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360 

tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420 

caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480
acaccgacgg
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 
ctcctttgat 
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 
gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 



caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 
ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
cttcaactgg 
tgccaagaag 
ggtgttccag 
gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
ggatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 
tcatttcctt 
aaacagttgg 



ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc
ctgctgtcca 
aaccagtggc 
agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 
aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 
ctcccccaac 
aaccacaaaa 



tggaagcact 
ctctttttgg 
gctggcgccg 
agggggccct 
tgagcttcgg 
atctacgctg 
ctgatcccta 
acacccttaa 



agccggccac 
cttatggtgc 
cacggctggc 
cttttgagct 
aggagacatc 
agagggaggc 
ccatcgagat 
cctcccccag 
acctccggaa 
gggaggagct 
gccggcctgt 
actcggcgga 
gggaagacgt 
agttcatcga 
cctaqctgga 
aggcctttga 
aggaggagac 
ccctcccaga 
agggacccca 
aggaccccaa 
gtgaccagga 
tgacctgcac 
tctccaagga 
agtttggggc 
gggaccagct 
cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 



3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6911 



<210> 15 

<211> 6910 

<212> DNA 

<213> Homo sapiens 

<400> 15 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120

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tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 

caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga agagaaccaa 480 

agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg acctcaaggg 540 

catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg agacgatggg 600 

gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg ccacccctag 660 

tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca caggggcctc 720 

gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt tcccgccccc 780 

tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840 

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900 

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960 

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020 

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080 

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140 

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200 

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260 

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320 

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380 

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440 

cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500 

gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560 

tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620 

ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680 

gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740 

cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800 

taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860 

tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920 

cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980 

agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040 

tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100 

tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160 

ggccttctac tcagccacca tgctgcagga tgtggatgat gccatccagt ttgaggtcag 2220 

catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280 

gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340 

acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400 

ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460 

gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520 

catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580 

ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640 

ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700 

cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760 

gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820 

ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880 

gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940 

gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000 

tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060 

aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120 

cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180 

gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240 

ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300 

cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360

tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420 

caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480 

acaccgacgg cggcgctggg tgcgcctgcg caggagggat ctcagccaaa tggaagcact 3540 

gaaaaggcac aggcaggcgg aggcggaggg cgagggctgg gagtacgcct ctctttttgg 3600 

ctggaagttc cacctcgagt accgcaagac agatgccttc cgccgccgcc gctggcgccg 3660 

tcgcatggag ccactggaga agacggggcc tgcagctgtg tttgcccttg agggggccct 3720 

gggcggcgtg atggatgaca agagtgaaga ttccatgtcc gtctccacct tgagcttcgg 3780 

tgtgaacaga cccacgattt cctgcatatt cgactatggg aaccgctacc atctacgctg 3840 

ctacatgtac caggcccggg acctggctgc gatggacaag gactcttttt ctgatcccta 3900 

tgccatcgtc tccttcctgc accagagcca gaagacggtg gtggtgaaga acacccttaa 3960 

ccccacctgg gaccagacgc tcatcttcta cgagatcgag atctttggcg agccggccac 4020 

agttgctgag caaccgccca gcattgtggt ggagctgtac gaccatgaca cttatggtgc 4080 

agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140 
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crtggttccca 

catccagaga 

aaggatcctg 

caacatctac 

cctggcatgg 

cctcgtggta 

gaaccccaac 

ctactgcccc 

ggtgggccag 

gagtccatcc 

gctcatcgac 

ttggtggagc 

gaaggatttt 

gggcctgtct 

agaagatcca 

agacccagcc 

ggagtgcttg 

tggaaagtgt 

taactacatc 

tctgcctctg 

cgaaaagatc 

tcgctgtgga 

ccgcccctcc 

ccggacagac 

caggatccca 

gcagcagggc 

agacatcgag 

gcctggacct 

tatctggaat 

cgacatttat 

gcattatcgt 

tacctgccag 

aagactgaga 

tcctttgatg 

aagacagcca 

gtgtcccttt 

gagaagaaaa 

catgaggagc 

gacccaaggc 

ttcatcctgt 

ctgctgttcc 

aagcccttca 

tgggactggc 

ttgtcctgcc 

catggagctc 

ttggatcagc 

aaaaaaaaaa 



ctgacgaggg 

gagaagccgg 

gatgagtctg 

atggttcctc 

ggcctgcgga 

gagtgtgggg 

tttgacatct 

cccatcaccg 

tgtaccatcc 

ccacagggtg 

attgatgaca 

aaattctttg 

gacaccctga 

gacttttgta 

tctgtgattg 

atccccatgc 

gtccgtatct 

gatccttaca 

ccctgcacgc 

gagaaggacc 

ggtgagacgg 

ctcccacaga 

cagctcctcc 

cgtgtaatgt 

aacccacacc 

ctggtcccgg 

caggggaagc 

cccttcaaca 

accagagatg 

gtgaaaggtt 

tccctgggag 

ctgagcaagt 

gcaaaatccc 

attttctggg 

agaagtgctc 

ttgagcagaa 

tactggcggg 

ggcctgctgg 

gccccgacac 

ggcggcgttt 

tggccatctt 

gctgaggact 

ctgcctcctc 

agggtgggca 

tgagatcacc 

tcagacatat 



gcagccagcc 

ccatccacca 

aggacacaga 

agaacatcaa 

acatgaagag 

gccagacggt 

gcaccctctt 

tcaaggtcat 

gctccctgga 

gcccagacga 

aggagcccct 

cctccatagg 

aggtctatga 

acaccttcaa 

gtgaatttaa 

ccccaagaca 

acattgtccg 

tcaagatctc 

tggagcccgt 

taaagatcac 

tcgtcgacct 

cctactgtgt 

acctcttctg 

ttcaggataa 

tgggcccagt 

agcacgtgga 

tgcagatgtg 

tcaccccacg 

tgatcctgga 

ggatgattgg 

gtgaagcaac 

ctgtaccatt 

agcacgagtg 

ctccctgcag 

cttggaccag 

aacagtgaag 

caagctggaa 

ccagggccgg 

ctccttcctg 

ccggtgggcc 

catctacgcc 

ctcctgccct 

cgcccagctc 

gacagacaga 

ccacttccat 

ttcagtataa 



gtcgggggag 

tattcctggt 

cctgccctac 

gccagcgctc 

ttaccagctg 

gcagtcctgt 

catggaagtg 

cgataaccgc 

gagcttcctg 

tgtgagccta 

catccccatc 

ggagagggaa 

cacacagctg 

gctgtaccgg 

gggcctcttc 

gttccaccag 

agcatttggc 

catagggaag 

atttggaaag 

tctctatgac 

ggagaacagg 

ctctggaccg 

ccagcagcat 

agaatattcc 

ggaggagcgt 

gtcacggccc 

ggtcgaccta 

gagagccaga 

tgacctgagc 

ctttgaagaa 

ttcaactgga 

gccaagaagg 

gtgttccaga 

ctcgatctca 

ctggatgatg 

ggctggtggc 

atgaccttgg 

gatgagccca 

tggtttacct 

atcatcctct 

ttcccgaact 

gtagaagggg 

ggcgagctcc 

tggaccggcc 

catttccttc 

aacagttgga 



ctgctggcct 

tttgaggtgc 

ccaccacccc 

cagcgtaccg 

gccaacatct 

gtcatcagga 

atgctgccca 

cagtttggcc 

tgtgacccct 

ctcagtcctg 

caggaggaag 

aagtgcggct 

gagaatgtgg 

ggcaagacgc 

aaaatttatc 

ctggccgccc 

ctgcagccca 

aaatcagtga 

atgttcgagc 

tatgacctcc 

ctgctgtcca 

aaccagtggc 

agagtcaagg 

attgaagaga 

ctggctctgc 

ctctacagcc 

tttccgaagg 

aggtttttcc 

ctcacggggg 

cacaagcaaa 

ggttcatttt 

atgccttctg 

tctgggacaa 

accgcatgcc 

ctttccaccc 

cctgtgtagc 

agattgtagc 

acatgaaccc 

ccccatacaa 

tcatcatcct 

atgctgccat 

ccgtggggtc 

tccagacctc 

cacactccca 

tcccccaacc 

accacaaaaa 



cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 

cacctgtgta 

tagaggctgg 

atgtgcttca 

ccctgcagcc 

ccctggggcg 

tgcgttgtat 

agaagatgag 

agacagacgt 

ccccttcgac 

gaggctggac 

tgacaagttc 

caagccagcc 

agaatggttt 

agaagagggt 

agagagtgag 

taagcttgag 

gaccatgaag 

cttcatcctg 

gaagctggtg 

ccctccagca 

ctaggcctga 

gagttgctaa 

caacgctttt 

aaaaaaaaaa 



<210> 16 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 16 

tcgaccgccc agccaggtgc aaaatgccgt 
agattacagc tcgacggagc tcgggaaggg 
tgttctcgga acgccggctg acaagcgggg 
gcccactgga gcagccgggg gtggcccgtt 
agccagagat tcgagccggc ctcgcccagc 
ggcgcctcgg ccctcccgac ctttccgagc 
acacgcgcca agcatgctga gggtcttcat 
caccgacatc agcgatgcct actgctccgc 
agtcatcaag aacagcgtga accctgtatg 
catccccctg gaccagggct ctgagcttca 
gaggaacagg ttcctggggg aagccaaggt 
tctgtccgcc agcttcaatg cccccctgct 
gctggtcctg caggtgtcct acacaccgct



gtcattggga 
cggcgggggt 
tgagcgcagg 
cccctttaag 
cagccctctc 
cctctttgcg 
cctctatgcc 
ggtgtttgca 
gaatgaggga 
tgtggtggtc 
cccactccga 
ggacaccaag 
gcctggagct 



gactccgcag 
ggaagatgag 
cggggcgggg 
agcaactgct 
cagcgagggg 
ccctgggcgc 
gagaacgtcc 
ggggtgaaga 
tttgaatggg 
aaagaccatg 
gaggtcctcg 
aagcagccca 
gtgcccctgt 



ccggagcatt 
cagaagcccc 
acccagccta 
ctaagccagg 
acccacaagc 
acggggccct
acacacccga 
agagaaccaa 
acctcaaggg 
agacgatggg 
ccacccctag 
caggggcctc 
tcccgccccc 



4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6910 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 




tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag acacaggagg 840 

agaggaagac acagaggacc agggactcac tggagatgag gcggagccat tcctggatca 900 

aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc ctccgcccca 960 

ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc tgtcagacaa 1020 

accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc cgggggtgaa 1080 

catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc ggatccacaa 1140 

gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact ctcctgggga 1200 

gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca ggacagatgc 1260 

tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc ggcacgccta 1320 

tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg ccagaggcta 1380 

cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg agagaaaaga 1440 

cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag gcgtagccct 1500 

gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc agatggacga 1560 

tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga agaacttggt 1620 

ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga tcttggagaa 1680 

gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc cctccatgtg 1740 

cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg acatcgtggc 1800 

taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag aagaggagcc 1860 

tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg gcttcctccc 1920 

cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca caggcttccc 1980

agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc gtggccggct 2040 

tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg aggaccttcc 2100 

tgcggatgac atcctccggg tggagaagta ccttaggagg cgcaagtact ccctgtttgc 2160

ggccttctac tcagccacca tgctgcagga tgtggatgat gccatccagt ttgaggtcag 2220 

catcgggaac tacgggaaca agttcgacat gacctgcctg ccgctggcct ccaccactca 2280 

gtacagccgt gcagtctttg acgggtgcca ctactactac ctaccctggg gtaacgtgaa 2340 

acctgtggtg gtgctgtcat cctactggga ggacatcagc catagaatcg agactcagaa 2400 

ccagctgctt gggattgctg accggctgga agctggcctg gagcaggtcc acctggccct 2460 

gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg gctcagctga cggatgagct 2520

catcgcaggc tgcagccagc ctctgggtga catccatgag acaccctctg ccacccacct 2580 

ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc caaatcactg aggctgccct 2640 

ggccctgaag ctcggccaca gtgagctccc tgcagctctg gagcaggcgg aggactggct 2700 

cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc ctgccggaca tcgtcatctg 2760 

gatgctgcag ggagacaagc gtgtggcata ccagcgggtg cccgcccacc aagtcctctt 2820 

ctcccggcgg ggtgccaact actgtggcaa gaattgtggg aagctacaga caatctttct 2880 

gaaatatccg atggagaagg tgcctggcgc ccggatgcca gtgcagatac gggtcaagct 2940 

gtggtttggg ctctctgtgg atgagaagga gttcaaccag tttgctgagg ggaagctgtc 3000 

tgtctttgct gaaacctatg agaacgagac taagttggcc cttgttggga actggggcac 3060 

aacgggcctc acctacccca agttttctga cgtcacgggc aagatcaagc tacccaagga 3120 

cagcttccgc ccctcggccg gctggacctg ggctggagat tggttcgtgt gtccggagaa 3180 

gactctgctc catgacatgg acgccggtca cctgagcttc gtggaagagg tgtttgagaa 3240 

ccagacccgg cttcccggag gccagtggat ctacatgagt gacaactaca ccgatgtgaa 3300 

cggggagaag gtgcttccca aggatgacat tgagtgccca ctgggctgga agtgggaaga 3360 

tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag caaggctggg agtatagcat 3420 

caccatcccc ccggagcgga agccgaagca ctgggtccct gctgagaaga tgtactacac 3480 

acaccgacgg cggcgctggg tgcgcctgcg caggagggat ctcagccaaa tggaagcact 3540 

gaaaaggcac aggcaggcgg aggcggaggg cgagggctgg gagtacgcct ctctttttgg 3600

ctggaagttc cacctcgagt accgcaagac agatgccttc cgccgccgcc gctggcgccg 3660 

tcgcatggag ccactggaga agacggggcc tgcagctgtg tttgcccttg agggggccct 3720 

gggcggcgtg atggatgaca agagtgaaga ttccatgtcc gtctccacct tgagcttcgg 3780 

tgtgaacaga cccacgattt cctgcatatt cgactatggg aaccgctacc atctacgctg 3840 

ctacatgtac caggcccggg acctggctgc gatggacaag gactcttttt ctgatcccta 3900 

tgccatcgtc tccttcctgc accagagcca gaagacggtg gtggtgaaga acacccttaa 3960 

ccccacctgg gaccagacgc tcatcttcta cgagatcgag atctttggcg agccggccac 4020 

agttgctgag caaccgccca gcattgtggt ggagctgtac gaccatgaca cttatggtgc 4080 

agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140 

ctggttccca ctgacgaggg gcagccagcc gtcgggggag ctgctggcct cttttgagct 4200 

catccagaga gagaagccgg ccatccacca tattcctggt tttgaggtgc aggagacatc 4260 

aaggatcctg gatgagtctg aggacacaga cctgccctac ccaccacccc agagggaggc 4320 

caacatctac atggttcctc agaacatcaa gccagcgctc cagcgtaccg ccatcgagat 4380 

cctggcatgg ggcctgcgga acatgaagag ttaccagctg gccaacatct cctcccccag 4440 

cctcgtggta gagtgtgggg gccagacggt gcagtcctgt gtcatcagga acctccggaa 4500 

gaaccccaac tttgacatct gcaccctctt catggaagtg atgctgccca gggaggagct 4560 

ctactgcccc cccatcaccg tcaaggtcat cgataaccgc cagtttggcc gccggcctgt 4620 

ggtgggccag tgtaccatcc gctccctgga gagcttcctg tgtgacccct actcggcgga 4680 

gagtccatcc ccacagggtg gcccagacga tgtgagccta ctcagtcctg gggaagacgt 4740 

gctcatcgac attgatgaca aggagcccct catccccatc caggaggaag agttcatcga 4800 
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ttggtggagc 

gaaggatttt 

gggcctgtct 

agaagatcca 

agacccagcc 

ggagtgcttg 

tggaaagtgt 

taactacatc 

tctgcctctg 

cgaaaagatc 

tcgctgtgga 

ccgcccctcc 

ccggacagac 

caggatccca 

gcagcagggc 

agacatcgag 

gcctggacct 

tatctggaat 

cgacatttat 

gcattatcgt 

ctacctgcca 

caagactgag 

ctcctttgat 

caagacagcc 

tgtgtccctt 

tgagaagaaa 

gcatgaggag 

ggacccaagg 

gttcatcctg 

gctgctgttc 

gaagcccttc 

atgggactgg 

attgtcctgc 

acatggagct 

tttggatcag 

aaaaaaaaaa 



aaattctttg 

gacaccctga 

gacttttgta 

tctgtgattg 

atccccatgc 

gtccgtatct 

gatccttaca 

ccctgcacgc 

gagaaggacc 

ggtgagacgg 

ctcccacaga 

cagctcctcc 

cgtgtaatgt 

aacccacacc 

ctggtcccgg 

caggggaagc 

cccttcaaca 

accagagatg 

gtgaaaggtt 

tccctgggag 

gctgagcaag 

agcaaaatcc 

gattttctgg 

aagaagtgct 

tttgagcaga 

atactggcgg 

cggcctgctg 

cgccccgaca 

tggcggcgtt 

ctggccatct 

agctgaggac 

cctgcctcct 

cagggtgggc 

ctgagatcac 

ctcagacata 



cctccatagg 

aggtctatga 

acaccttcaa 

gtgaatttaa 

ccccaagaca 

acattgtccg 

tcaagatctc 

tggagcccgt 

taaagatcac 

tcgtcgacct 

cctactgtgt 

acctcttctg 

ttcaggataa 

tgggcccagt 

agcacgtgga 

tgcagatgtg 

tcaccccacg 

tgatcctgga 

ggatgattgg 

gtgaaggcaa 

tctgtaccat 

cagcacgagt 

gctccctgca 

ccttggacca 

aaacagtgaa 

gcaagctgga 

gccagggccg 

cctccttcct 

tccggtgggc 

tcatctacgc 

tctcctgccc 

ccgcccagct 

agacagacag 

cccacttcca 

tttcagtata 



ggagagggaa 

cacacagctg 

gctgtaccgg 

gggcctcttc 

gttccaccag 

agcatttggc 

catagggaag 

atttggaaag 

tctctatgac 

ggagaacagg 

ctctggaccg 

ccagcagcat 

agaatattcc 

ggaggagcgt 

gtcacggccc 

ggtcgaccta 

gagagccaga 

tgacctgagc 

ctttgaagaa 

cttcaactgg 

tgccaagaag 

ggtgttccag 

gctcgatctc 

gctggatgat 

gggctggtgg 

aatgaccttg 

ggatgagccc 

gtggtttacc 

catcatcctc 

cttcccgaac 

tgtagaaggg 

cggcgagctc 

atggaccggc 

tcatttcctt 

aaacagttgg 



aagtgcggct 

gagaatgtgg 

ggcaagacgc 

aaaatttatc 

ctggccgccc 

ctgcagccca 

aaatcagtga 

atgttcgagc 

tatgacctcc 

ctgctgtcca 

aaccagtggc 

agagtcaagg 

attgaagaga 

ctggctctgc 

ctctacagcc 

tttccgaagg 

aggtttttcc 

ctcacggggt 

cacaagcaaa 

aggttcattt 

gatgccttct 

atctgggaca 

aaccgcatgc 

gctttccacc 

ccctgtgtag 

gagattgtag 

aacatgaacc 

tccccataca 

ttcatcatcc 

tatgctgcca 

gccgtggggt 

ctccagacct 

ccacactccc 

ctcccccaac 

aaccacaaaa 



cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 

cacctgtgta 

tagaggctgg 

atgtgcttca 

ccctgcagcc 

ccctggggcg 

tgcgttgtat 

agaagatgag 

agacagacgt 

tccccttcga 

ggaggctgga 

atgacaagtt 

ccaagccagc 

cagaatggtt 

cagaagaggg 

cagagagtga 

ctaagcttga 

agaccatgaa 

tcttcatcct 

tgaagctggt 

cccctccagc 

cctaggcctg 

agagttgcta 

ccaacgcttt 

aaaaaaaaaa 



<210> 17 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 17 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag 
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag 
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg 
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct 
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg 
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc
acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc 
caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga 
agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg 
catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg 
gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg 
tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca 
gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt 
tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag 
agaggaagac acagaggacc agggactcac tggagatgag gcggagccat 
aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc 
ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc 
accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc 
catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc 
gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact 
gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca 
tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc 
tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg 
cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg 



ccggagcatt 

cagaagcccc 

acccagccta 

ctaagccagg 

acccacaagc 

acggggccct 

acacacccga 

agagaaccaa 

acctcaaggg 

agacgatggg 

ccacccctag 

caggggcctc 

tcccgccccc 

acacaggagg 

tcctggatca 

ctccgcccca 

tgtcagacaa 

cgggggtgaa 

ggatccacaa 

ctcctgggga 

ggacagatgc 

ggcacgccta 

ccagaggcta 

agagaaaaga 



4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6911 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
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cccctctgaa 

gcgaggagcc 

tgccgtgatg 

ggaccccttt 

gacggccaac 

cgaaaaaatg 

taccacctac 

tgcaggtgct 

cacttttggg 

agacccctac 

tctgctctcc 

tgcggatgac 

ggccttctac

catcgggaac 

gtacagccgt 

acctgtggtg 

ccagctgctt 

gaaggcgcag 

catcgcaggc 

ggaccagtac 

ggccctgaag 

cctgcgtctg 

gatgctgcag 

ctcccggcgg 

gaaatatccg 

gtggtttggg 

tgtctttgct 

aacgggcctc 

cagcttccgc 

gactctgctc 

ccagacccgg 

cggggagaag 

tgaggaatgg 

caccatcccc 



acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag
agacgagttt 
ctggttccca 
catccagaga 
aaggatcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 
ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 



gacaaggagg 
cacttctgcc 
gacaacgtga 
gtggaggtca 
cctcagtgga
aggattcgta 
ctgagtatgt
gtcaagcctt 
ccctgctaca 
acagagctca 
ctggagacca 
atcctccggg 
tcagccacca 
tacgggaaca 
gcagtctttg 
gtgctgtcat 
gggattgctg 
tgctccacgg 
tgcagccagc 
ctgtaccagc 
ctcggccaca 
cgtgccctgg 
ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 



acattgaaag 
tgaaggtctt 
aacagatctt 
gctttgcggg 
accagaacat 
tcatagactg 
cgaaaatctc 
cgaaagcctc 
tcaacctcta 
acacaggcaa 
agctggtgga 
tggagaagta 
tgctgcagga 
agttcgacat 
acgggtgcca 
cctactggga 
accggctgga 
aggacgtgga 
ctctgggtga 
tgcgcaccca 
gtgagctccc 
cagaggagcc 
gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 




caacctgctc 
ccgggccgag 
tggcttcgag 
gaaaatgctg 
cacactgcct 
ggaccgcctg 
tgcccctgga 
agacttggat 
tggcagtccc 
gggggaaggt 
gcacagtgaa 
ccttaggagg 
tgtggatgat 
gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 
ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca 
tgtcgatgag 
ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 



cggcccacag 
gacttgccgc 
agtaacaaga 
tgcagcaaga 
gccatgtttc 
actcacaatg 
ggagaaatag 
gactacctgg 
agagagttca 
gtggcttatc 
cagaaggtgg 
cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 



gcgtagccct 

agatggacga 

agaacttggt 

tcttggagaa 

cctccatgtg 

acatcgtggc 

aagaggagcc 

gcttcctccc 

caggcttccc 

gtggccggct 

aggaccttcc 

ccctgtttgc 

ttgaggtcag 

ccaccactca 

gtaacgtgaa 

agactcagaa 

acctggccct 

cggatgagct 

ccacccacct 

aggctgccct 

aggactggct 

tcgtcatctg 

aagtcctctt 

caatctttct 

gggtcaagct 

ggaagctgtc 

actggggcac 

tacccaagga 

gtccggagaa 

tgtttgagaa 

ccgatgtgaa 

agtgggaaga 

agtatagcat 

tgtactacac 

tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 



1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 






ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 
ctcctttgat 
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 
gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 



ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
cttcaactgg 
tgccaagaag 
ggtgttccag 
gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
agatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 
tcatttcctt 
aaacagttgg 



agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 
aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 
ctcccccaac 
aaccacaaaa 



cacctgtgta 
tagaggctgg 
atgtgcttca 
ccctgcagcc 
ccctggggcg 
tgcgttgtat 
agaagatgag 
agacagacgt 
tccccttcga 
ggaggctgga 
atgacaagtt 
ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 



<210> 18 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 18 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag 
agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag 
tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg 
gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct 
agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg 
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc 
acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc 
caccgacatc agcgatgcct actgctccgc ggtgtttgca ggggtgaaga 
agtcatcaag aacagcgtga accctgtatg gaatgaggga tttgaatggg 
catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg 
gaggaacagg ttcctggggg aagccaaggt cccactccga gaggtcctcg 
tctgtccgcc agcttcaatg cccccctgct ggacaccaag aagcagccca 
gctggtcctg caggtgtcct acacaccgct gcctggagct gtgcccctgt 
tactcctctg gagccctccc cgactctgcc tgacctggat gtagtggcag 
agaggaagac acagaggacc agggactcac tggagatgag gcggagccat 
aagcggaggc ccgggggctc ccaccacccc aaggaaacta ccttcacgtc 
ctaccccggg atcaaaagaa agcgaagtgc gcctacatct agaaagctgc 
accgcaggat ttccagatca gggtccaggt gatcgagggg cgccagctgc 
catcaagcct gtggtcaagg ttaccgctgc agggcagacc aagcggacgc 
gggaaacagc ccactcttca atgagactct tttcttcaac ttgtttgact 
gctgtttgat gagcccatct ttatcacggt ggtagactct cgttctctca 
tctcctcggg gagttccgga tggacgtggg caccatttac agagagcccc 
tctcaggaag tggctgctgc tctcagaccc tgatgacttc tctgctgggg 
cctgaaaaca agcctttgtg tgctggggcc tggggacgaa gcgcctctgg 
cccctctgaa gacaaggagg acattgaaag caacctgctc cggcccacag 
gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag gacttgccgc 
tgccgtgatg gacaacgtga aacagatctt tggcttcgag agtaacaaga 
ggaccccttt gtggaggtca gctttgcggg gaaaatgctg tgcagcaaga 
gacggccaac cctcagtgga accagaacat cacactgcct gccatgtttc 
cgaaaaaatg aggattcgta tcatagactg ggaccgcctg actcacaatg 
taccacctac ctgagtatgt cgaaaatctc tgcccctgga ggagaaatag 
tgcaggtgct gtcaagcctt cgaaagcctc agacttggat gactacctgg 
cacttttggg ccctgctaca tcaacctcta tggcagtccc agagagttca 
agacccctac acagagctca acacaggcaa gggggaaggt gtggcttatc 
tctgctctcc ctggagacca agctggtgga gcacagtgaa cagaaggtgg 



ccggagcatt 

cagaagcccc 

acccagccta 

ctaagccagg 

acccacaagc 

acggggccct 

acacacccga 

agagaaccaa 

acctcaaggg 

agacgatggg 

ccacccctag 

caggggcctc 

tcccgccccc 

acacaggagg 

tcctggatca 

ctccgcccca 

tgtcagacaa 

cgggggtgaa 

ggatccacaa 

ctcctgggga 

ggacagatgc 

ggcacgccta 

ccagaggcta 

agagaaaaga 

gcgtagccct 

agatggacga 

agaacttggt 

tcttggagaa 

cctccatgtg 

acatcgtggc 

aagaggagcc 

gcttcctccc 

caggcttccc 

gtggccggct 

aggaccttcc 



5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6911 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
tgcggatgac 
ggccttctac 
catcgggaac 
gtacagccgt 
acctgtggtg 
ccagctgctt 
gaaggcgcag 
catcgcaggc 
ggaccagtac 
ggccctgaag 
cctgcgtctg 
gatgctgcag 
ctcccggcgg 
gaaatatccg 
gtggtttggg 
tgtctttgct 
aacgggcctc 
cagcttccgc 
gactctgctc 
ccagacccgg 
cggggagaag 
tgaggaatgg 
caccatcccc 



acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 
agacgagttt 
ctggttccca 
catccagaga 
aagggtcctg 
caacatctac 
cctggcatgg 
cctcgtggta 
gaaccccaac 
ctactgcccc 

ggtgggccag 
gagtccatcc 
gctcatcgac 
ttggtggagc 
gaaggatttt 
gggcctgtct 
agaagatcca 
agacccagcc 
ggagtgcttg 
tggaaagtgt 
taactacatc 
tctgcctctg 
cgaaaagatc 
tcgctgtgga 
ccgcccctcc 
ccggacagac 
caggatccca 
gcagcagggc 
agacatcgag 
gcctggacct 
tatctggaat 
cgacatttat 
gcattatcgt 
ctacctgcca 
caagactgag 



atcctccggg 
tcagccacca 
tacgggaaca 
gcagtctttg 
gtgctgtcat 
gggattgctg 
tgctccacgg 
tgcagccagc 
ctgtaccagc 
ctcggccaca 
cgtgccctgg 
ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 
atgggtcgct 
ctgacgaggg 
gagaagccgg 
gatgagtctg 
atggttcctc 
ggcctgcgga 
gagtgtgggg 
tttgacatct 
cccatcaccg 
tgtaccatcc 
ccacagggtg 
attgatgaca 
aaattctttg 
gacaccctga 
gacttttgta 
tctgtgattg 
atccccatgc 
gtccgtatct 
gatccttaca 
ccctgcacgc 
gagaaggacc 
ggtgagacgg 
ctcccacaga 
cagctcctcc 
cgtgtaatgt 
aacccacacc 
ctggtcccgg 
caggggaagc 
cccttcaaca 
accagagatg 
gtgaaaggtt 
tccctgggag 
gctgagcaag 
agcaaaatcc 



tggagaagta 
tgctgcagga 
agttcgacat 
acgggtgcca 
cctactggga 
accggctgga 
aggacgtgga 
ctctgggtga 
tgcgcaccca 
gtgagctccc 
cagaggagcc 
gtgtggcata 
actgtggcaa 
tgcctggcgc 
atgagaagga 
agaacgagac 
agttttctga 
gctggacctg 
acgccggtca 
gccagtggat 
aggatgacat 
tcaaccgggc 
agccgaagca 
tgcgcctgcg 
aggcggaggg 
accgcaagac 
agacggggcc 
agagtgaaga 
cctgcatatt 
acctggctgc 
accagagcca 
tcatcttcta 
gcattgtggt 
gcatctgtca 
gcagccagcc 
ccatccacca 
aggacacaga 
agaacatcaa 
acatgaagag 
gccagacggt 
gcaccctctt 
tcaaggtcat 
gctccctgga 
gcccagacga 
aggagcccct 
cctccatagg 
aggtctatga 
acaccttcaa 
gtgaatttaa 
ccccaagaca 
acattgtccg 
tcaagatctc 
tggagcccgt 
taaagatcac 
tcgtcgacct 
cctactgtgt 
acctcttctg 
ttcaggataa 
tgggcccagt 
agcacgtgga 
tgcagatgtg 
tcaccccacg 
tgatcctgga 
ggatgattgg 
gtgaaggcaa 
tctgtaccat 
cagcacgagt 
ccttaggagg 
tgtggatgat 
gacctgcctg 
ctactactac 
ggacatcagc 
agctggcctg 
ctcgctggtg 
catccatgag 
tcacctgagc 
tgcagctctg 
ccagaacagc 
ccagcgggtg 
gaattgtggg 
ccggatgcca 
gttcaaccag 
taagttggcc 
cgtcacgggc 
ggctggagat 
cctgagcttc 
ctacatgagt 
tgagtgccca 
tgtcgatgag 
ctgggtccct 
caggagggat 
cgagggctgg 
agatgccttc 
tgcagctgtg 
ttccatgtcc 
cgactatggg 
gatggacaag 
gaagacggtg 
cgagatcgag 
ggagctgtac 
accgagtctg 
gtcgggggag 
tattcctggt 
cctgccctac 
gccagcgctc 
ttaccagctg 
gcagtcctgt 
catggaagtg 
cgataaccgc 
gagcttcctg 
tgtgagccta 
catccccatc 
ggagagggaa 
cacacagctg 
gctgtaccgg 
gggcctcttc 
gttccaccag 
agcatttggc 
catagggaag 
atttggaaag 
tctctatgac 
ggagaacagg 
ctctggaccg 
ccagcagcat 
agaatattcc 
ggaggagcgt 
gtcacggccc 
ggtcgaccta 
gagagccaga 
tgacctgagc 
ctttgaagaa 
cttcaactgg 
tgccaagaag 
ggtgttccag 



cgcaagtact 
gccatccagt 
ccgctggcct 
ctaccctggg 
catagaatcg 
gagcaggtcc 
gctcagctga 
acaccctctg 
caaatcactg 
gagcaggcgg 
ctgccggaca 
cccgcccacc 
aagctacaga 
gtgcagatac 
tttgctgagg 
cttgttggga 
aagatcaagc 
tggttcgtgt 
gtggaagagg 
gacaactaca 
ctgggctgga 
caaggctggg 
gctgagaaga 
ctcagccaaa 
gagtacgcct 
cgccgccgcc 
tttgcccttg 
gtctccacct 
aaccgctacc 
gactcttttt 
gtggtgaaga 
atctttggcg 
gaccatgaca 
gaacggatgc 
ctgctggcct 
tttgaggtgc 
ccaccacccc 
cagcgtaccg 
gccaacatct 
gtcatcagga 
atgctgccca 
cagtttggcc 
tgtgacccct 
ctcagtcctg 
caggaggaag 
aagtgcggct 
gagaatgtgg 
ggcaagacgc 
aaaatttatc 
ctggccgccc 
ctgcagccca 
aaatcagtga 
atgttcgagc 
tatgacctcc 
ctgctgtcca 
aaccagtggc 
agagtcaagg 
attgaagaga 
ctggctctgc 
ctctacagcc 
tttccgaagg 
aggtttttcc 
ctcacggggg 
cacaagcaaa 
aggttcattt 
gatgccttct 
atctgggaca 



ccctgtttgc 

ttgaggtcag 

ccaccactca 

gtaacgtgaa 

agactcagaa 

acctggccct 

cggatgagct 

ccacccacct 

aggctgccct 

aggactggct 

tcgtcatctg 

aagtcctctt 

caatctttct 

gggtcaagct 

ggaagctgtc 

actggggcac 

tacccaagga 

gtccggagaa 

tgtttgagaa 

ccgatgtgaa 

agtgggaaga 

agtatagcat 

tgtactacac 

tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 

cacctgtgta 

tagaggctgg 

atgtgcttca 

ccctgcagcc 

ccctggggcg 

tgcgttgtat 

agaagatgag 

agacagacgt 

tccccttcga 

ggaggctgga 

atgacaagtt 



2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 
ctcctttgat 
caagacagcc 
tgtgtccctt 
tgagaagaaa 
gcatgaggag 
ggacccaagg 
gttcatcctg 
gctgctgttc 
gaagcccttc 
atgggactgg 
attgtcctgc 
acatggagct 
tttggatcag 
aaaaaaaaaa 



gattttctgg 
aagaagtgct 
tttgagcaga 
atactggcgg 
cggcctgctg 
cgccccgaca 
tggcggcgtt 
ctggccatct 
agctgaggac 
cctgcctcct 
cagggtgggc 
ctgagatcac 
ctcagacata 



gctccctgca 
ccttggacca 
aaacagtgaa 
gcaagctgga 
gccagggccg 
cctccttcct 
tccggtgggc 
tcatctacgc 
tctcctgccc 
ccgcccagct 
agacagacag 
cccacttcca 
tttcagtata 
gctcgatctc 
gctggatgat 
gggctggtgg 
aatgaccttg 
ggatgagccc 
gtggtttacc 
catcatcctc 
cttcccgaac 
tgtagaaggg 
cggcgagctc 
atggaccggc 
tcatttcctt 
aaacagttgg 



aaccgcatgc 
gctttccacc 
ccctgtgtag 
gagattgtag 
aacatgaacc 
tccccataca 
ttcatcatcc 
tatgctgcca 
gccgtggggt 
ctccagacct 
ccacactccc 
ctcccccaac 
aaccacaaaa 



ccaagccagc 
cagaatggtt 
cagaagaggg 
cagagagtga 
ctaagcttga 
agaccatgaa 
tcttcatcct 
tgaagctggt 
cccctccagc 
cctaggcctg 
agagttgcta 
ccaacgcttt 
aaaaaaaaaa 



6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6911 



<210> 19 

<211> 6911 

<212> DNA 

<213> Homo sapiens 

<400> 19 

tcgaccgccc agccaggtgc aaaatgccgt 
agattacagc tcgacggagc tcgggaaggg 
tgttctcgga acgccggctg acaagcgggg 
gcccactgga gcagccgggg gtggcccgtt 
agccagagat tcgagccggc ctcgcccagc 
ggcgcctcgg ccctcccgac ctttccgagc 
acacgcgcca agcatgctga gggtcttcat 
caccgacatc agcgatgcct actgctccgc 
agtcatcaag aacagcgtga accctgtatg 
catccccctg gaccagggct ctgagcttca 
gaggaacagg ttcctggggg aagccaaggt 
tctgtccgcc agcttcaatg cccccctgct 
gctggtcctg caggtgtcct acacaccgct 
tactcctctg gagccctccc cgactctgcc 
agaggaagac acagaggacc agggactcac 
aagcggaggc ccgggggctc ccaccacccc 
ctaccccggg atcaaaagaa agcgaagtgc 
accgcaggat ttccagatca gggtccaggt 
catcaagcct gtggtcaagg ttaccgctgc 
gggaaacagc ccactcttca atgagactct 
gctgtttgat gagcccatct ttatcacggt 
tctcctcggg gagttccgga tggacgtggg 
tctcaggaag tggctgctgc tctcagaccc 
cctgaaaaca agcctttgtg tgctggggcc 
cccctctgaa gacaaggagg acattgaaag 
gcgaggagcc cacttctgcc tgaaggtctt 
tgccgtgatg gacaacgtga aacagatctt 
ggaccccttt gtggaggtca gctttgcggg 
gacggccaac cctcagtgga accagaacat 
cgaaaaaatg aggattcgta tcatagactg 
taccacctac ctgagtatgt cgaaaatctc 
tgcaggtgct gtcaagcctt cgaaagcctc 
cacttttggg ccctgctaca tcaacctcta 
agacccctac acagagctca acacaggcaa 
tctgctctcc ctggagacca agctggtgga 
tgcggatgac atcctccggg tggagaagta 
ggccttctac tcagccacca tgctgcagga 
catcgggaac tacgggaaca agttcgacat 
gtacagccgt gcagtctttg acgggtgcca 
acctgtggtg gtgctgtcat cctactggga 
ccagctgctt gggattgctg accggctgga 
gaaggcgcag tgctccacgg aggacgtgga 
catcgcaggc tgcagccagc ctctgggtga 
ggaccagtac ctgtaccagc tgcgcaccca 
ggccctgaag ctcggccaca gtgagctccc 
cctgcgtctg cgtgccctgg cagaggagcc 



gtcattggga 

cggcgggggt 

tgagcgcagg 

cccctttaag 

cagccctctc 

cctctttgcg 

cctctatgcc 

ggtgtttgca 

gaatgaggga 

tgtggtggtc 

cccactccga 

ggacaccaag 

gcctggagct 

tgacctggat 

tggagatgag 

aaggaaacta 

gcctacatct 

gatcgagggg 

agggcagacc 

tttcttcaac 

ggtagactct 

caccatttac 

tgatgacttc 

tggggacgaa 

caacctgctc 

ccgggccgag 

tggcttcgag 

gaaaatgctg 

cacactgcct 

ggaccgcctg 

tgcccctgga 

agacttggat 

tggcagtccc 

gggggaaggt 

gcacagtgaa 

ccttaggagg 

tgtggatgat 

gacctgcctg 

ctactactac 

ggacatcagc 

agctggcctg 

ctcgctggtg 

catccatgag 

tcacctgagc 

tgcagctctg 

ccagaacagc 



gactccgcag 

ggaagatgag 

cggggcgggg 

agcaactgct 

cagcgagggg 

ccctgggcgc 

gagaacgtcc 

ggggtgaaga 

tttgaatggg 

aaagaccatg 

gaggtcctcg 

aagcagccca 

gtgcccctgt 

gtagtggcag 

gcggagccat 

ccttcacgtc 

agaaagctgc 

cgccagctgc 

aagcggacgc 

ttgtttgact 

cgttctctca 

agagagcccc 

tctgctgggg 

gcgcctctgg 

cggcccacag 

gacttgccgc 

agtaacaaga 

tgcagcaaga 

gccatgtttc 

actcacaatg 

ggagaaatag 

gactacctgg 

agagagttca 

gtggcttatc 

cagaaggtgg 

cgcaagtact 

gccatccagt 

ccgctggcct 

ctaccctggg 

catagaatcg 

gagcaggtcc 

gctcagctga 

acaccctctg 

caaatcactg 

gagcaggcgg 

ctgccggaca 



ccggagcatt 

cagaagcccc 

acccagccta 

ctaagccagg 

acccacaagc 

acggggccct 

acacacccga 

agagaaccaa 

acctcaaggg 

agacgatggg 

ccacccctag 

caggggcctc 

tcccgccccc 

acacaggagg 

tcctggatca 

ctccgcccca 

tgtcagacaa 

cgggggtgaa 

ggatccacaa 

ctcctgggga 

ggacagatgc 

ggcacgccta 

ccagaggcta 

agagaaaaga 

gcgtagccct 

agatggacga 

agaacttggt 

tcttggagaa 

cctccatgtg 

acatcgtggc 

aagaggagcc 

gcttcctccc 

caggcttccc 

gtggccggct 

aggaccttcc 

ccctgtttgc 

ttgaggtcag 

ccaccactca 

gtaacgtgaa 

agactcagaa 

acctggccct 

cggatgagct 

ccacccacct 

aggctgccct 

aggactggct 

tcgtcatctg 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
gatgctgcag 

ctcccggcgg 

gaaatatccg 

gtggtttggg 

tgtctttgct 

aacgggcctc 

cagcttccgc 

gactctgctc 

ccagacccgg 

cggggagaag 

tgaggaatgg 

caccatcccc 

acaccgacgg 

gaaaaggcac 

ctggaagttc 

tcgcatggag 

gggcggcgtg 

tgtgaacaga 

ctacatgtac 

tgccatcgtc 

ccccacctgg 

agttgctgag 

agacgagttt 

ctggttccca 

catccagaga 

aaggatcctg 

caacatctac 

cctggcatgg 

cctcgtggta 

gaaccccaac 

ctactgcccc 

ggtgggccag 

gagtccatcc 

gctcatcgac 

ttggtggagc 

gaaggatttt 

gggcctgtct 

agaagatcca 

agacccagcc 

ggagtgcttg 

tggaaagtgt 

taactacatc 

tctgcctctg 

cgaaaagatc 

tcgctgtgga 

ccgcccctcc 

ccggacagac 

caggatccca 

gcagcagggc 

agacatcgag 

gcctggacct 

tatctggaat 

cgacatttat 

gcattatcgt 

ctacctgcca 

caagactgag 

ctcctttgat 

caagacagcc 

tgtgtccctt 

tgagaagaaa 

gcatgaggag 

ggacccaagg 

gttcatcctg 

gctgctgttc 

gaagcccttc 

atgggactgg 

attgtcctgc 



ggagacaagc 

ggtgccaact 

atggagaagg 

ctctctgtgg 

gaaacctatg 

acctacccca 

ccctcggccg 

catgacatgg 

cttcccggag 

gtgcttccca 

tccacagacc 

ccggagcgga 

cggcgctggg 

aggcaggcgg 

cacctcgagt 

ccactggaga 

atggatgaca 

cccacgattt 

caggcccggg 

tccttcctgc 

gaccagacgc 

caaccgccca 

atgggtcgct 

ctgacgaggg 

gagaagccgg 

gatgagtctg 

atggttcctc 

ggcctgcgga 

gagtgtgggg 

tttgacatct 

cccatcaccg 

tgtaccatcc 

ccacagggtg 

attgatgaca 

aaattctttg 

gacaccctga 

gacttttgta 

tctgtgattg 

atccccatgc 

gtccgtatct 

gatccttaca 

ccctgcacgc 

gagaaggacc 

ggtgagacgg 

ctcccacaga 

cagctcctcc 

cgtgtaatgt 

aacccacacc 

ctggtcccgg 

caggggaagc 

cccttcaaca 

accagagatg 

gtgaaaggtt 

tccctgggag 

gctgagcaag 

agcaaaatcc 

gattttctgg 

aagaagtgct 

tttgagcaga 

atactggcgg 

cggcctgctg 

cgccccgaca 

tggcggtgtt 

ctggccatct 

agctgaggac 

cctgcctcct 

cagggtgggc 



gtgtggcata 

actgtggcaa 

tgcctggcgc 

atgagaagga 

agaacgagac 

agttttctga 

gctggacctg 

acgccggtca 

gccagtggat 

aggatgacat 

tcaaccgggc 

agccgaagca 

tgcgcctgcg 

aggcggaggg 

accgcaagac 

agacggggcc 

agagtgaaga 

cctgcatatt 

acctggctgc 

accagagcca 

tcatcttcta 

gcattgtggt 

gcatctgtca 

gcagccagcc 

ccatccacca 

aggacacaga 

agaacatcaa 

acatgaagag 

gccagacggt 

gcaccctctt 

tcaaggtcat 

gctccctgga 

gcccagacga 

aggagcccct 

cctccatagg 

aggtctatga 

acaccttcaa 

gtgaatttaa 

ccccaagaca 

acattgtccg 

tcaagatctc 

tggagcccgt 

taaagatcac 

tcgtcgacct 

cctactgtgt 

acctcttctg 

ttcaggataa 

tgggcccagt 

agcacgtgga 

tgcagatgtg 

tcaccccacg 

tgatcctgga 

ggatgattgg 

gtgaaggcaa 

tctgtaccat 

cagcacgagt 

gctccctgca 

ccttggacca 

aaacagtgaa 

gcaagctgga 

gccagggccg 

cctccttcct 

tccggtgggc 

tcatctacgc 

tctcctgccc 

ccgcccagct 

agacagacag 



ccagcgggtg 

gaattgtggg 

ccggatgcca 

gttcaaccag 

taagttggcc 

cgtcacgggc 

ggctggagat 

cctgagcttc 

ctacatgagt 

tgagtgccca 

tgtcgatgag 

ctgggtccct 

caggagggat 

cgagggctgg 

agatgccttc 

tgcagctgtg 

ttccatgtcc 

cgactatggg 

gatggacaag 

gaagacggtg 

cgagatcgag 

ggagctgtac 

accgagtctg 

gtcgggggag 

tattcctggt 

cctgccctac 

gccagcgctc 

ttaccagctg 

gcagtcctgt 

catggaagtg 

cgataaccgc 

gagcttcctg 

tgtgagccta 

catccccatc 

ggagagggaa 

cacacagctg 

gctgtaccgg 

gggcctcttc 

gttccaccag 

agcatttggc 

catagggaag 

atttggaaag 

tctctatgac 

ggagaacagg 

ctctggaccg 

ccagcagcat 

agaatattcc 

ggaggagcgt 

gtcacggccc 

ggtcgaccta 

gagagccaga 

tgacctgagc 

ctttgaagaa 

cttcaactgg 

tgccaagaag 

ggtgttccag 

gctcgatctc 

gctggatgat 

gggctggtgg 

aatgaccttg 

ggatgagccc 

gtggtttacc 

catcatcctc 

cttcccgaac 

tgtagaaggg 

cggcgagctc 

atggaccggc 



cccgcccacc 

aagctacaga 

gtgcagatac 

tttgctgagg 

cttgttggga 

aagatcaagc 

tggttcgtgt 

gtggaagagg 

gacaactaca 

ctgggctgga 

caaggctggg 

gctgagaaga 

ctcagccaaa 

gagtacgcct 

cgccgccgcc 

tttgcccttg 

gtctccacct 

aaccgctacc 

gactcttttt 

gtggtgaaga 

atctttggcg 

gaccatgaca 

gaacggatgc 

ctgctggcct 

tttgaggtgc 

ccaccacccc 

cagcgtaccg 

gccaacatct 

gtcatcagga 

atgctgccca 

cagtttggcc 

tgtgacccct 

ctcagtcctg 

caggaggaag 

aagtgcggct 

gagaatgtgg 

ggcaagacgc 

aaaatttatc 

ctggccgccc 

ctgcagccca 

aaatcagtga 

atgttcgagc 

tatgacctcc 

ctgctgtcca 

aaccagtggc 

agagtcaagg 

attgaagaga 

ctggctctgc 

ctctacagcc 

tttccgaagg 

aggtttttcc 

ctcacggggg 

cacaagcaaa 

aggttcattt 

gatgccttct 

atctgggaca 

aaccgcatgc 

gctttccacc 

ccctgtgtag 

gagattgtag 

aacatgaacc 

tccccataca 

ttcatcatcc 

tatgctgcca 

gccgtggggt 

ctccagacct 

ccacactccc 



aagtcctctt 

caatctttct 

gggtcaagct 

ggaagctgtc 

actggggcac 

tacccaagga 

gtccggagaa 

tgtttgagaa 

ccgatgtgaa 

agtgggaaga 

agtatagcat 

tgtactacac 

tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 

cacctgtgta 

tagaggctgg 

atgtgcttca 

ccctgcagcc 

ccctggggcg 

tgcgttgtat 

agaagatgag 

agacagacgt 

tccccttcga 

ggaggctgga 

atgacaagtt 

ccaagccagc 

cagaatggtt 

cagaagaggg 

cagagagtga 

ctaagcttga 

agaccatgaa 

tcttcatcct 

tgaagctggt 

cccctccagc 

cctaggcctg 

agagttgcta 



2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 
acatggagct ctgagatcac cccacttcca tcatttcctt ctcccccaac ccaacgcttt 6840 
tttggatcag ctcagacata tttcagtata aaacagttgg aaccacaaaa aaaaaaaaaa 6900 
aaaaaaaaaa a 6911 

<210> 20 

<211> 6911 

<212> DNA 

<213> Homo sapiens 



<400> 20 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga 
agattacagc tcgacggagc tcgggaaggg cggcgggggt 
tgttctcgga acgccggctg acaagcgggg tgagcgcagg 
gcccactgga gcagccgggg gtggcccgtt cccctttaag 
agccagagat tcgagccggc ctcgcccagc cagccctctc 
ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg 
acacgcgcca agcatgctga gggtcttcat cctctatgcc 
caccgacatc agcgatgcct actgctccgc ggtgtttgca 
agtcatcaag aacagcgtga accctgtatg gaatgaggga 
catccccctg gaccagggct ctgagcttca tgtggtggtc 
gaggaacagg ttcctggggg aagccaaggt cccactccga 
tctgtccgcc agcttcaatg cccccctgct ggacaccaag 
gctggtcctg caggtgtcct acacaccgct gcctggagct 
tactcctctg gagccctccc cgactctgcc tgacctggat 
agaggaagac acagaggacc agggactcac tggagatgag 
aagcggaggc ccgggggctc ccaccacccc aaggaaacta 
ctaccccggg atcaaaagaa agcgaagtgc gcctacatct 
accgcaggat ttccagatca gggtccaggt gatcgagggg 
catcaagcct gtggtcaagg ttaccgctgc agggcagacc 
gggaaacagc ccactcttca atgagactct tttcttcaac 
gctgtttgat gagcccatct ttatcacggt ggtagactct 
tctcctcggg gagttccgga tggacgtggg caccatttac 
tctcaggaag tggctgctgc tctcagaccc tgatgacttc 
cctgaaaaca agcctttgtg tgctggggcc tggggacgaa 
cccctctgaa gacaaggagg acattgaaag caacctgctc 
gcgaggagcc cacttctgcc tgaaggtctt ccgggccgag 
tgccgtgatg gacaacgtga aacagatctt tggcttcgag 
ggaccccttt gtggaggtca gctttgcggg gaaaatgctg 
gacggccaac cctcagtgga accagaacat cacactgcct 
cgaaaaaatg aggattcgta tcatagactg ggaccgcctg 
taccacctac ctgagtatgt cgaaaatctc tgcccctgga 
tgcaggtgct gtcaagcctt cgaaagcctc agacttggat 
cacttttggg ccctgctaca tcaacctcta tggcagtccc 
agacccctac acagagctca acacaggcaa gggggaaggt 
tctgctctcc ctggagacca agctggtgga gcacagtgaa 
tgcggatgac atcctccggg tggagaagta ccttaggagg 
ggccttctac tcagccacca tgctgcagga tgtggatgat 
catcgggaac tacgggaaca agttcgacat gacctgcctg 
gtacagccgt gcagtctttg acgggtgcca ctactactac 
acctgtggtg gtgctgtcat cctactggga ggacatcagc 
ccagctgctt gggattgctg accggctgga agctggcctg 
gaaggcgcag tgctccacgg aggacgtgga ctcgctggtg 
catcgcaggc tgcagccagc ctctgggtga catccatgag 
ggaccagtac ctgtaccagc tgcgcaccca tcacctgagc 
ggccctgaag ctcggccaca gtgagctccc tgcagctctg 
cctgcgtctg cgtgccctgg cagaggagcc ccagaacagc 
gatgctgcag ggagacaagc gtgtggcata ccagcgggtg 
ctcccggcgg ggtgccaact actgtggcaa gaattgtggg 
gaaatatccg atggagaagg tgcctggcgc ccggatgcca 
gtggtttggg ctctctgtgg atgagaagga gttcaaccag 
tgtctttgct gaaacctatg agaacgagac taagttggcc 
aacgggcctc acctacccca agttttctga cgtcacgggc 
cagcttccgc ccctcggccg gctggacctg ggctggagat 
gactctgctc catgacatgg acgccggtca cctgagcttc 
ccagacccgg cttcccggag gccagtggat ctacatgagt 
cggggagaag gtgcttccca aggatgacat tgagtgccca 
tgaggaatgg tccacagacc tcaaccgggc tgtcgatgag 



gactccgcag 

ggaagatgag 

cggggcgggg 

agcaactgct 

cagcgagggg 

ccctgggcgc 

gagaacgtcc 

ggggtgaaga 

tttgaatggg 

aaagaccatg 

gaggtcctcg 

aagcagccca 

gtgcccctgt 

gtagtggcag 

gcggagccat 

ccttcacgtc 

agaaagctgc 

cgccagctgc 

aagcggacgc 

ttgtttgact 

cgttctctca 

agagagcccc 

tctgctgggg 

gcgcctctgg 

cggcccacag 

gacttgccgc 

agtaacaaga 

tgcagcaaga 

gccatgtttc 

actcacaatg 

ggagaaatag 

gactacctgg 

agagagttca 

gtggcttatc 

cagaaggtgg 

cgcaagtact 

gccatccagt 

ccgctggcct 

ctaccctggg 

catagaatcg 

gagcaggtcc 

gctcagctga 

acaccctctg 

caaatcactg 

gagcaggcgg 

ctgccggaca 

cccgcccacc 

aagctacaga 

gtgcagatac 

tttgctgagg 

cttgttggga 

aagatcaagc 

tggttcgtgt 

gtggaagagg 

gacaactaca 

ctgggctgga 

caaggctggg 



ccggagcatt 

cagaagcccc 

acccagccta 

ctaagccagg 

acccacaagc 

acggggccct 

acacacccga 

agagaaccaa 

acctcaaggg 

agacgatggg 

ccacccctag 

caggggcctc 

tcccgccccc 

acacaggagg 

tcctggatca 

ctccgcccca 

tgtcagacaa 

cgggggtgaa 

ggatccacaa 

ctcctgggga 

ggacagatgc 

ggcacgccta 

ccagaggcta 

agagaaaaga 

gcgtagccct 

agatggacga 

agaacttggt 

tcttggagaa 

cctccatgtg 

acatcgtggc 

aagaggagcc 

gcttcctccc 

caggcttccc 

gtggccggct 

aggaccttcc 

ccctgtttgc 

ttgaggtcag 

ccaccactca 

gtaacgtgaa 

agactcagaa 

acctggccct 

cggatgagct 

ccacccacct 

aggctgccct 

aggactggct 

tcgtcatctg 

aagtcctctt 

caatctttct 

gggtcaagct 

ggaagctgtc 

actggggcac 

tacccaagga 

gtccggagaa 

tgtttgagaa 

ccgatgtgaa 

agtgggaaga 

agtatagcat 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
caccatcccc 

acaccgacgg 

gaaaaggcac 

ctggaagttc 

tcgcatggag 

gggcggcgtg 

tgtgaacaga 

ctacatgtac 

tgccatcgtc 

ccccacctgg 

agttgctgag 

agacgagttt 

ctggttccca 

catccagaga 

aaggatcctg 

caacatctac 

cctggcatgg 

cctcgtggta 

gaaccccaac 

ctactgcccc 

ggtgggccag 

gagtccatcc 

gctcatcgac 

ttggtggagc 

gaaggatttt 

gggcctgtct 

agaagatcca 

agacccagcc 

ggagtgcttg 

tggaaagtgt 

taactacatc 

tctgcctctg 

cgaaaagatc 

tcgctgtgga 

ccgcccctcc 

ccggacagac 

caggatccca 

gcagcagggc 

agacatcgag 

gcctggacct 

tatctggaat 

cgacatttat 

gcgttatcgt 

ctacctgcca 

caagactgag 

ctcctttgat 

caagacagcc 

tgtgtccctt 

tgagaagaaa 

gcatgaggag 

ggacccaagg 

gttcatcctg 

gctgctgttc 

gaagcccttc 

atgggactgg 

attgtcctgc 

acatggagct 

tttggatcag 

aaaaaaaaaa 



ccggagcgga 

cggcgctggg 

aggcaggcgg 

cacctcgagt 

ccactggaga 

atggatgaca 

cccacgattt 

caggcccggg 

tccttcctgc 

gaccagacgc 

caaccgccca 

atgggtcgct 

ctgacgaggg 

gagaagccgg 

gatgagtctg 

atggttcctc 

ggcctgcgga 

gagtgtgggg 

tttgacatct 

cccatcaccg 

tgtaccatcc 

ccacagggtg 

attgatgaca 

aaattctttg 

gacaccctga 

gacttttgta 

tctgtgattg 

atccccatgc 

gtccgtatct 

gatccttaca 

ccctgcacgc 

gagaaggacc 

ggtgagacgg 

ctcccacaga 

cagctcctcc 

cgtgtaatgt 

aacccacacc 

ctggtcccgg 

caggggaagc 

cccttcaaca 

accagagatg 

gtgaaaggtt 

tccctgggag 

gctgagcaag 

agcaaaatcc 

gattttctgg 

aagaagtgct 

tttgagcaga 

atactggcgg 

cggcctgctg 

cgccccgaca 

tggcggcgtt 

ctggccatct 

agctgaggac 

cctgcctcct 

cagggtgggc 

ctgagatcac 

ctcagacata 



agccgaagca 

tgcgcctgcg 

aggcggaggg 

accgcaagac 

agacggggcc 

agagtgaaga 

cctgcatatt 

acctggctgc 

accagagcca 

tcatcttcta 

gcattgtggt 

gcatctgtca 

gcagccagcc 

ccatccacca 

aggacacaga 

agaacatcaa 

acatgaagag 

gccagacggt 

gcaccctctt 

tcaaggtcat 

gctccctgga 

gcccagacga 

aggagcccct 

cctccatagg 

aggtctatga 

acaccttcaa 

gtgaatttaa 

ccccaagaca 

acattgtccg 

tcaagatctc 

tggagcccgt 

taaagatcac 

tcgtcgacct 

cctactgtgt 

acctcttctg 

ttcaggataa 

tgggcccagt 

agcacgtgga 

tgcagatgtg 

tcaccccacg 

tgatcctgga 

ggatgattgg 

gtgaaggcaa 

tctgtaccat 

cagcacgagt 

gctccctgca 

ccttggacca 

aaacagtgaa 

gcaagctgga 

gccagggccg 

cctccttcct 

tccggtgggc 

tcatctacgc 

tctcctgccc 

ccgcccagct 

agacagacag 

cccacttcca 

tttcagtata 



ctgggtccct 

caggagggat 

cgagggctgg 

agatgccttc 

tgcagctgtg 

ttccatgtcc 

cgactatggg 

gatggacaag 

gaagacggtg 

cgagatcgag 

ggagctgtac 

accgagtctg 

gtcgggggag 

tattcctggt 

cctgccctac 

gccagcgctc 

ttaccagctg 

gcagtcctgt 

catggaagtg 

cgataaccgc 

gagcttcctg 

tgtgagccta 

catccccatc 

ggagagggaa 

cacacagctg 

gctgtaccgg 

gggcctcttc 

gttccaccag 

agcatttggc 

catagggaag 

atttggaaag 

tctctatgac 

ggagaacagg 

ctctggaccg 

ccagcagcat 

agaatattcc 

ggaggagcgt 

gtcacggccc 

ggtcgaccta 

gagagccaga 

tgacctgagc 

ctttgaagaa 

cttcaactgg 

tgccaagaag 

ggtgttccag 

gctcgatctc 

gctggatgat 

gggctggtgg 

aatgaccttg 

ggatgagccc 

gtggtttacc 

catcatcctc 

cttcccgaac 

tgtagaaggg 

cggcgagctc 

atggaccggc 

tcatttcctt 

aaacagttgg 



gctgagaaga 

ctcagccaaa 

gagtacgcct 

cgccgccgcc 

tttgcccttg 

gtctccacct 

aaccgctacc 

gactcttttt 

gtggtgaaga 

atctttggcg 

gaccatgaca 

gaacggatgc 

ctgctggcct 

tttgaggtgc 

ccaccacccc 

cagcgtaccg 

gccaacatct 

gtcatcagga 

atgctgccca 

cagtttggcc 

tgtgacccct 

ctcagtcctg 

caggaggaag 

aagtgcggct 

gagaatgtgg 

ggcaagacgc 

aaaatttatc 

ctggccgccc 

ctgcagccca 

aaatcagtga 

atgttcgagc 

tatgacctcc 

ctgctgtcca 

aaccagtggc 

agagtcaagg 

attgaagaga 

ctggctctgc 

ctctacagcc 

tttccgaagg 

aggtttttcc 

ctcacggggg 

cacaagcaaa 

aggttcattt 

gatgccttct 

atctgggaca 

aaccgcatgc 

gctttccacc 

ccctgtgtag 

gagattgtag 

aacatgaacc 

tccccataca 

ttcatcatcc 

tatgctgcca 

gccgtggggt 

ctccagacct 

ccacactccc 

ctcccccaac 

aaccacaaaa 



tgtactacac 

tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 

cacggctggc 

cttttgagct 

aggagacatc 

agagggaggc 

ccatcgagat 

cctcccccag 

acctccggaa 

gggaggagct 

gccggcctgt 

actcggcgga 

gggaagacgt 

agttcatcga 

cctacctgga 

aggcctttga 

aggaggagac 

ccctcccaga 

agggacccca 

aggaccccaa 

gtgaccagga 

tgacctgcac 

tctccaagga 

agtttggggc 

gggaccagct 

cacctgtgta 

tagaggctgg 

atgtgcttca 

ccctgcagcc 

ccctggggcg 

tgcgttgtat 

agaagatgag 

agacagacgt 

tccccttcga 

ggaggctgga 

atgacaagtt 

ccaagccagc 

cagaatggtt 

cagaagaggg 

cagagagtga 

ctaagcttga 

agaccatgaa 

tcttcatcct 

tgaagctggt 

cccctccagc 

cctaggcctg 

agagttgcta 

ccaacgcttt 

aaaaaaaaaa 



3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6911 



<210> 21 

<211> 6909 

<212> DNA 

<213> Homo sapiens 

<400> 21 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 

agattacagc 

tgttctcgga 

gcccactgga 

agccagagat 

ggcgcctcgg 

acacgcgcca 

caccgacatc 

agtcatcaag 

catccccctg 

gaggaacagg 

tctgtccgcc 

gctggtcctg 

tactcctctg 

agaggaagac 

aagcggaggc 

ctaccccggg 

accgcaggat 

catcaagcct 

gggaaacagc 

gctgtttgat 

tctcctcggg 

tctcaggaag 

cctgaaaaca 

cccctctgaa 

gcgaggagcc 

tgccgtgatg 

ggaccccttt 

gacggccaac 

cgaaaaaatg 

taccacctac 

tgcaggtgct 

cacttttggg 

agacccctac 

tctgctctcc 

tgcggatgac 

ggccttctac 

catcgggaac 

gtacagccgt 

acctgtggtg 

ccagctgctt 

gaaggcgcag 

catcgcaggc 

ggaccagtac 

ggccctgaag 

cctgcgtctg 

gatgctgcag 

ctcccggcgg 

gaaatatccg 

gtggtttggg 

tgtctttgct 

aacgggcctc 

cagcttccgc 

gactctgctc 

ccagacccgg 

cggggagaag 

tgaggaatgg 

caccatcccc 



acaccgacgg 
gaaaaggcac 
ctggaagttc 
tcgcatggag 
gggcggcgtg 
tgtgaacaga 
ctacatgtac 
tgccatcgtc 
ccccacctgg 
agttgctgag 



tcgacggagc 
acgccggctg 
gcagccgggg 
tcgagccggc 
ccctcccgac 
agcatgctga 
agcgatgcct 
aacagcgtga 
gaccagggct 
ttcctggggg 
agcttcaatg 
caggtgtcct 
gagccctccc 
acagaggacc 
ccgggggctc 
atcaaaagaa 
ttccagatca 
gtggtcaagg 
ccactcttca 
gagcccatct 
gagttccgga 
tggctgctgc 
agcctttgtg 
gacaaggagg 
cacttctgcc 
gacaacgtga 
gtggaggtca 
cctcagtgga 
aggattcgta 
ctgagtatgt 
gtcaagcctt 
ccctgctaca 
acagagctca 
ctggagacca 
atcctccggg 
tcagccacca 
tacgggaaca 
gcagtctttg 
gtgctgtcat 
gggattgctg 
tgctccacgg 
tgcagccagc 
ctgtaccagc 
ctcggccaca 
cgtgccctgg 
ggagacaagc 
ggtgccaact 
atggagaagg 
ctctctgtgg 
gaaacctatg 
acctacccca 
ccctcggccg 
catgacatgg 
cttcccggag 
gtgcttccca 
tccacagacc 
ccggagcgga 
cggcgctggg 
aggcaggcgg 
cacctcgagt 
ccactggaga 
atggatgaca 
cccacgattt 
caggcccggg 
tccttcctgc 
gaccagacgc 
caaccgccca 



tcgggaaggg 

acaagcgggg 

gtggcccgtt 

ctcgcccagc 

ctttccgagc 

gggtcttcat 

actgctccgc 

accctgtatg 

ctgagcttca 

aagccaaggt 

cccccctgct 

acacaccgct 

cgactctgcc 

agggactcac 

ccaccacccc 

agcgaagtgc 

gggtccaggt 

ttaccgctgc 

atgagactct 

ttatcacggt 

tggacgtggg 

tctcagaccc 

tgctggggcc 

acattgaaag 

tgaaggtctt 

aacagatctt 

gctttgcggg 

accagaacat 

tcatagactg 

cgaaaatctc 

cgaaagcctc 

tcaacctcta 

acacaggcaa 

agctggtgga 

tggagaagta 

tgctgcagga 

agttcgacat 

acgggtgcca 

cctactggga 

accggctgga 

aggacgtgga 

ctctgggtga 

tgcgcaccca 

gtgagctccc 

cagaggagcc 

gtgtggcata 

actgtggcaa 

tgcctggcgc 

atgagaagga 

agaacgagac 

agttttctga 

gctggacctg 

acgccggtca 

gccagtggat 

aggatgacat 

tcaaccgggc 

agccgaagca 

tgcgcctgcg 

aggcggaggg 

accgcaagac 

agacggggcc 

agagtgaaga 

cctgcatatt 

acctggctgc 

accagagcca 

tcatcttcta 

gcattgtggt 
cggcgggggt 
tgagcgcagg 
cccctttaag 
cagccctctc 
cctctttgcg 
cctctatgcc 
ggtgtttgca 
gaatgaggga 
tgtggtggtc 
cccactccga 
ggacaccaag 
gcctggagct 
tgacctggat 
tggagatgag 
aaggaaacta 
gcctacatct 
gatcgagggg 
agggcagacc 
tttcttcaac 
ggtagactct 
caccatttac 
tgatgacttc 
tggggacgaa 
caacctgctc 
ccgggccgag 
tggcttcgag 
gaaaatgctg 
cacactgcct 
ggaccgcctg 
tgcccctgga 
agacttggat 
tggcagtccc 
gggggaaggt 

gcacagtgaa 

ccttaggagg 

tgtggatgat 

gacctgcctg 

ctactactac 

ggacatcagc 

agctggcctg 

ctcgctggtg 

catccatgag 

tcacctgagc 

tgcagctctg 

ccagaacagc 

ccagcgggtg 

gaattgtggg 

ccggatgcca 

gttcaaccag 

taagttggcc 

cgtcacgggc 

ggctggagat 

cctgagcttc 

ctacatgagt 

tgagtgccca 

tgtcgatgag 

ctgggtccct 

caggagggat 

cgagggctgg 

agatgccttc 

tgcagctgtg 

ttccatgtcc 

cgactatggg 

gatggacaag 

gaagacggtg 

cgagatcgag 

ggagctgtac 



ggaagatgag 

cggggcgggg 

agcaactgct 

cagcgagggg 

ccctgggcgc 

gagaacgtcc 

ggggtgaaga 

tttgaatggg 

aaagaccatg 

gaggtcctcg 

aagcagccca 

gtgcccctgt 

gtagtggcag 

gcggagccat 

ccttcacgtc 

agaaagctgc 

cgccagctgc 

aagcggacgc 

ttgtttgact 

cgttctctca 

agagagcccc 

tctgctgggg 

gcgcctctgg 

cggcccacag 

gacttgccgc 

agtaacaaga 

tgcagcaaga 

gccatgtttc 

actcacaatg 

ggagaaatag 

gactacctgg 

agagagttca 

gtggcttatc 

cagaaggtgg 

cgcaagtact 

gccatccagt 

ccgctggcct 

ctaccctggg 

catagaatcg 

gagcaggtcc 

gctcagctga 

acaccctctg 

caaatcactg 

gagcaggcgg 

ctgccggaca 

cccgcccacc 

aagctacaga 

gtgcagatac 

tttgctgagg 

cttgttggga 

aagatcaagc 

tggttcgtgt 

gtggaagagg 

gacaactaca 

ctgggctgga 

caaggctggg 

gctgagaaga 

ctcagccaaa 

gagtacgcct 

cgccgccgcc 

tttgcccttg 

gtctccacct 

aaccgctacc 

gactcttttt 

gtggtgaaga 

atctttggcg 

gaccatgaca 



cagaagcccc 
acccagccta 
ctaagccagg 
acccacaagc 
acggggccct 
acacacccga 
agagaaccaa 
acctcaaggg 
agacgatggg 
ccacccctag 
caggggcctc 
tcccgccccc 
acacaggagg 
tcctggatca 
ctccgcccca 
tgtcagacaa 

cgggggtgaa 

ggatccacaa 

ctcctgggga 

ggacagatgc 

ggcacgccta 

ccagaggcta 

agagaaaaga 

gcgtagccct 

agatggacga 

agaacttggt 

tcttggagaa 

cctccatgtg 

acatcgtggc 

aagaggagcc 

gcttcctccc 

caggcttccc 

gtggccggct 

aggaccttcc 

ccctgtttgc 

ttgaggtcag 

ccaccactca 

gtaacgtgaa 

agactcagaa 

acctggccct 

cggatgagct 

ccacccacct 

aggctgccct 

aggactggct 

tcgtcatctg 

aagtcctctt 

caatctttct 

gggtcaagct 

ggaagctgtc 

actggggcac 

tacccaagga 

gtccggagaa 

tgtttgagaa 

ccgatgtgaa 

agtgggaaga 

agtatagcat 

tgtactacac 

tggaagcact 

ctctttttgg 

gctggcgccg 

agggggccct 

tgagcttcgg 

atctacgctg 

ctgatcccta 

acacccttaa 

agccggccac 

cttatggtgc 



120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
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agacgagttt atgggtcgct gcatctgtca accgagtctg gaacggatgc cacggctggc 4140 

ctggttccca ctgacgaggg gcagccagcc gtcgggggag ctgctggcct cttttgagct 4200 

catccagaga gagaagccgg ccatccacca tattcctggt tttgaggtgc aggagacatc 4260 

aaggatcctg gatgagtctg aggacacaga cctgccctac ccaccacccc agagggaggc 4320 

caacatctac atggttcctc agaacatcaa gccagcgctc cagcgtaccg ccatcgagat 4380 

cctggcatgg ggcctgcgga acatgaagag ttaccagctg gccaacatct cctcccccag 4440 

cctcgtggta gagtgtgggg gccagacggt gcagtcctgt gtcatcagga acctccggaa 4500 

gaaccccaac tttgacatct gcaccctctt catggaagtg atgctgccca gggaggagct 4560 

ctactgcccc cccatcaccg tcaaggtcat cgataaccgc cagtttggcc gccggcctgt 4620 

ggtgggccag tgtaccatcc gctccctgga gagcttcctg tgtgacccct actcggcgga 4680 

gagtccatcc ccacagggtg gcccagacga tgtgagccta ctcagtcctg gggaagacgt 4740 

gctcatcgac attgatgaca aggagcccct catccccatc caggaggaag agttcatcga 4800 

ttggtggagc aaattctttg cctccatagg ggagagggaa aagtgcggct cctacctgga 4860 

gaaggatttt gacaccctga aggtctatga cacacagctg gagaatgtgg aggcctttga 4920 

gggcctgtct gacttttgta acaccttcaa gctgtaccgg ggcaagacgc aggaggagac 4980 

agaagatcca tctgtgattg gtgaatttaa gggcctcttc aaaatttatc ccctcccaga 5040 

agacccagcc atccccatgc ccccaagaca gttccaccag ctggccgccc agggacccca 5100 

ggagtgcttg gtccgtatct acattgtccg agcatttggc ctgcagccca aggaccccaa 5160 

tggaaagtgt gatccttaca tcaagatctc catagggaag aaatcagtga gtgaccagga 5220 

taactacatc ccctgcacgc tggagcccgt atttggaaag atgttcgagc tgacctgcac 5280 

tctgcctctg gagaaggacc taaagatcac tctctatgac tatgacctcc tctccaagga 5340 

cgaaaagatc ggtgagacgg tcgtcgacct ggagaacagg ctgctgtcca agtttggggc 5400 

tcgctgtgga ctcccacaga cctactgtgt ctctggaccg aaccagtggc gggaccagct 5460 

ccgcccctcc cagctcctcc acctcttctg ccagcagcat agagtcaagg cacctgtgta 5520 

ccggacagac cgtgtaatgt ttcaggataa agaatattcc attgaagaga tagaggctgg 5580 

caggatccca aacccacacc tgggcccagt ggaggagcgt ctggctctgc atgtgcttca 5640 

gcagcagggc ctggtcccgg agcacgtgga gtcacggccc ctctacagcc ccctgcagcc 5700 

agacatcgag caggggaagc tgcagatgtg ggtcgaccta tttccgaagg ccctggggcg 5760 

gcctggacct cccttcaaca tcaccccacg gagagccaga aggtttttcc tgcgttgtat 5820 

tatctggaat accagagatg tgatcctgga tgacctgagc ctcacggggg agaagatgag 5880 

cgacatttat gtgaaaggtt ggatgattgg ctttgaagaa cacaagcaaa agacagacgt 5940 

gcattatcgt tccctgggag gtgaaggcaa cttcaactgg aggttcattt tccccttcga 6000 

ctacctgcca gctgagcaag tctgtaccat tgccaagaag gatgccttct ggaggctgga 6060 

caagactgag caaaatccca gcacgagtgg tgttccagat ctgggacaat gacaagttct 6120 

cctttgatga ttttctgggc tccctgcagc tcgatctcaa ccgcatgccc aagccagcca 6180 

agacagccaa gaagtgctcc ttggaccagc tggatgatgc tttccaccca gaatggtttg 6240 

tgtccctttt tgagcagaaa acagtgaagg gctggtggcc ctgtgtagca gaagagggtg 6300 

agaagaaaat actggcgggc aagctggaaa tgaccttgga gattgtagca gagagtgagc 6360 

atgaggagcg gcctgctggc cagggccggg atgagcccaa catgaaccct aagcttgagg 6420 

acccaaggcg ccccgacacc tccttcctgt ggtttacctc cccatacaag accatgaagt 6480 

tcatcctgtg gcggcgtttc cggtgggcca tcatcctctt catcatcctc ttcatcctgc 6540 

tgctgttcct ggccatcttc atctacgcct tcccgaacta tgctgccatg aagctggtga 6600 

agcccttcag ctgaggactc tcctgccctg tagaaggggc cgtggggtcc cctccagcat 6660 

gggactggcc tgcctcctcc gcccagctcg gcgagctcct ccagacctcc taggcctgat 6720 

tgtcctgcca gggtgggcag acagacagat ggaccggccc acactcccag agttgctaac 6780 

atggagctct gagatcaccc cacttccatc atttccttct cccccaaccc aacgcttttt 6840 

tggatcagct cagacatatt tcagtataaa acagttggaa ccacaaaaaa aaaaaaaaaa 6900 

aaaaaaaaa 6909 



<210> 22 
<211> 20 
<212> DNA 

<213> Homo sapiens 
<400> 22 

tgggacctca agggcatccc 20 

<210> 23 
<211> 20 
<212> DNA 

<213> Homo sapiens 
<400> 23 

accatgctgc aggatgtgga 20 

<210> 24 
<211> 20 
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<212> DNA 

<213> Homo sapiens 

<400> 24 
gggaggtgaa ggcaacttca 

<210> 25 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 25 
ctcacggggg agaagatgag 

<210> 26 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 26 
ctgtggcggc gtttccggtg 

<210> 27 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 27 
acatcaagga tcctggatga 

<210> 28 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 28 
ctgtggcggc gtttccggtg 

<210> 29 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 29 
acagacgtgc attatcgttc 

<210> 30 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 30 

aagactgaga gcaaaatccc 20 

<210> 31 

<211> 507 

<212> DNA 

<213> Homo sapiens 

<400> 31 

tcgaccgccc agccaggtgc aaaatgccgt gtcattggga gactccgcag ccggagcatt 60 

agattacagc tcgacggagc tcgggaaggg cggcgggggt ggaagatgag cagaagcccc 120 

tgttctcgga acgccggctg acaagcgggg tgagcgcagg cggggcgggg acccagccta 180 

gcccactgga gcagccgggg gtggcccgtt cccctttaag agcaactgct ctaagccagg 240 

agccagagat tcgagccggc ctcgcccagc cagccctctc cagcgagggg acccacaagc 300 

ggcgcctcgg ccctcccgac ctttccgagc cctctttgcg ccctgggcgc acggggccct 360 

acacgcgcca agcatgctga gggtcttcat cctctatgcc gagaacgtcc acacacccga 420 
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caccgacatc agcgatgcct actgctccgc ggtgtttgca ggtaggaggg gccgaccacc 480 
ctcgccgggg tcggggtggg gtagagg 507 

<210> 32 

<211> 183 

<212> DNA 

<213> Homo sapiens 

<400> 32 

aaaggcggga tgtgtctctc cattctccct tttgtgtctc ttgtaggggt gaagaagaga 60 
accaaagtca tcaagaacag cgtgaaccct gtatggaatg aggtatgtga gtttttctcc 120 
ttccttttct ctctgtctgc tgcagggggc ttgggaggag gtgccttctc agcagtgtcc 180 
ttg 183 

<210> 33 

<211> 264 

<212> DNA 

<213> Homo sapiens 

<400> 33 

cattcatgaa tgcctactca gtgccctggt ggcacgaagg tgaaccagac acagtctctt 60 
ctcctagagg gccataggtt aagatgcctt ttctcttttt cttccaggga tttgaatggg 120 
acctcaaggg catccccctg gaccagggct ctgagcttca tgtggtggtc aaagaccatg 180 
agacgatggg gaggaacagg taaggtggcc agaggggggt gctccatggc ttgaaggtgc 240 
aggtaggatt gtggagtata caga 264 

<210> 34 

<211> 223 

<212> DNA 

<213> Homo sapiens 

<400> 34 

cagaagagcc agggtgcctt aggctagttt tctacatttg acttctctct cctctcaggt 60 
tcctggggga agccaaggtc ccactccgag aggtcctcgc cacccctagt ctgtccgcca 120 
gcttcaatgc ccccctgctg gacaccaaga agcagcccac aggggtaagt gcccatcagc 180 
ctctgccagg ttaaggtcca aggcattgcc aggtggcttc ctc 223 

<210> 35 

<211> 224 

<212> DNA 

<213> Homo sapiens 

<400> 35 

cagtggtccg aggccagcgc accaacctgt cccccacgtc tcatctcttc caggcctcgc 60 
tggtcctgca ggtgtcctac acaccgctgc ctggagctgt gcccctgttc ccgcccccta 120 
ctcctctgga gccctccccg actctgcctg acctggatgt agtggcaggt gggtagccca 180 
cgttggcctg gctgggcccc agcaagaatg gccggcagtg gcac 224 

<210> 36 

<211> 315 

<212> DNA 

<213> Homo sapiens 

<400> 36 

aggggcaggg gcagggccag agggccaggc ctcattaggg ccctctcctc ttagacacag 60 
gaggagagga agacacagag gaccagggac tcactggaga tgaggcggag ccattcctgg 120 
atcaaagcgg aggcccgggg gctcccacca ccccaaggaa actaccttca cgtcctccgc 180 
cccactaccc cgggatcaaa agaaagcgaa gtgcgcctac atctagaaag ctgctgtcag 240 
acaaaccgca ggatttccag gtgatgaacg ggctttctct gaccccaggc tcctcttcag 300 
ccatcagctg cgggt 315 



<210> 37 

<211> 249 

<212> DNA 

<213> Homo sapiens 
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<400> 37 

ccagtggtga gatggtccct gagatttctg actcttgggg tggatggtgg gtggtcctta 60 

actcttcccc cttctggctt tcagatcagg gtccaggtga tcgaggggcg ccagctgccg 120 

aaaqtgaaca tcaagcctgt ggtcaaggtt accgctgcag ggcagaccaa gcggacgcgg 180 

atccacaagg gaaacagccc actcttcaat gaggtgggag acatggggca tgagggcaga 240 
accttgtgg 249 

<210> 38 

<211> 185 

<212> DNA 

<213> Homo sapiens 

<400> 38 

ccctggcctg agggatcagc aggcactgat atgtctctct ttgctctgaa ccaacagact 60 
cttttcttca acttgtttga ctctcctggg gagctgtttg atgagcccat ctttatcacg 120 
gtatgtctca gcagtcaaag tgttctccgt gggctgtatg tatgcacata ggtgtcagtg 180 
cacac 185 

<210> 39 

<211> 196 

<212> DNA 

<213> Homo sapiens 

<400> 39 

aagagctatt gggttggccg tgtgggccac atgtccctgt gaatgtgagc catgatcttt 60 
ctctgcaggt ggtagactct cgttctctca ggacagatgc tctcctcggg gagttccggg 120 
taattgctta ttttctaaaa gcagtcagtt ctcacttctc cgtgttggtg gagcctctgt 180 
ggaccatggg cagggg 196 

<210> 40 
<211> 178 
<212> DNA 
<213> Homo sapiens 

<400> 40 

tggaatcgta taatgcacca cactttattt aacgctttgg cggcaagagt ttgatttgtg 60 
tctcctctct tgattgcaga tggacgtggg caccatttac agagagcccc gtgagttctc 120 
accactttgg ccgtatcctt gcattttggt tctggaggct gattggggac actcattt 178 

<210> 41 

<211> 231 

<212> DNA 

<213> Homo sapiens 

<400> 41 

ggggtcttct gattctggga tcaccaaagg atgttgtctc tcttagggca cgcctatctc 60 

aggaagtggc tgctgctctc agaccctgat gacttctctg ctggggccag aggctacctg 120 

aaaacaagcc tttgtgtgct ggggcctggg gacgaagcgc ctgtgagtac atttccctgg 180 

gtcttcctta cggtccccca cgcggcactt ggttgcggag gcaccaaacc a 231 

<210> 42 

<211> 247 

<212> DNA 

<213> Homo sapiens 

<400> 42 

gtcaaaaccc tgtgctcagg agcgcatgaa ggaacgtatt tggttttctt tgtagctgga 60 

gagaaaagac ccctctgaag acaaggagga cattgaaagc aacctgctcc ggcccacagg 120 

cgtagccctg cgaggagccc acttctgcct gaaggtcttc cgggccgagg acttgccgca 180 

gagtgcgtgg ggcgcgccct tgggtgggag gtctgcagga ggctggaggc gcagggctgg 240 

tgggggt 247 

<210> 43 
<211> 179 
<212> DNA 
<213> Homo sapiens 
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<400> 43 

caggcagtga ctggtgtgtc cctcttccca gtggacgatg ccgtgatgga caacgtgaaa 60 

cagatctttg gcttcgagag taacaagaag aacttggtgg acccctttgt ggaggtcagc 120 

tttgcgggga aaatggtaag gagcaaggga gcaggagggt tctctcggga ggggacggg 179 

<210> 44 

<211> 202 

<212> DNA 

<213> Homo sapiens 

<400> 44 

ccccggggga gcccagagtc cccatggagc tgatcaactt gtcccctccc tgtgtcttct 60 

agctgtgcag caagatcttg gagaagacgg ccaaccctca gtggaaccag aacatcacac 120 

tgcctgccat ggtgagcctc ctgtccccag caaacccaag gaggcccctg gggctctggg 180 

cttcgggagg tccagggctc ct 202 

<210> 45 

<211> 167 

<212> DNA 

<213> Homo sapiens 

<400> 45 

gggaggggct gttctatctt caaaaggact cttctcccaa cacgcctcta ttccttcctc 60 

agtttccctc catgtgcgaa aaaatgagga ttcgtatcat agactggtga gttctgagtc 120 

ttggagtctt tagggcgggc tgtcctgagg gggcgctccc tcagttt 167 

<210> 46 

<211> 220 

<212> DNA 

<213> Homo sapiens 

<400> 46 

tgtggcctga gttcctttcc tgtgtcaggc cctctctgct cccttgctct ctagggaccg 60 

cctgactcac aatgacatcg tggctaccac ctacctgagt atgtcgaaaa tctctgcccc 120 

tggaggagaa atagaaggta tgttccctct tcgttctgcc ctttgacccc ctgtgctctc 180 

cccccctcta tccagcttac acttctagtt ttgagagttt 220 

<210> 47 

<211> 172 

<212> DNA 

<213> Homo sapiens 

<400> 47 

acagcctgtt catgtaaccc gtccttctcc cagccatgcc caccctaacc ccttttccat 60 

ttctttacgc ttcagaggag cctgcaggtg ctgtcaagcc ttcgaaagcc tcagactgta 120 

cgttgctgtc accttgggga caaccagggg agtggggcct tgggttttgg ct 172 

<210> 48 

<211> 200 

<212> DNA 

<213> Homo sapiens 

<400> 48 

ccgacccctc tgattgccac ttgtgtctcc cagtggatga ctacctgggc ttcctcccca 60 

cttttgggcc ctgctacatc aacctctatg gcagtcccag agagttcaca ggcttcccag 120 

acccctacac agagctcaac acaggcaagg taagccggct ggagccctgg caagggcagg 180 

atgccacatg cccaggtggg 200 

<210> 49 

<211> 217 

<212> DNA 

<213> Homo sapiens 

<400> 49 

cctcccctct gtctcccctg ctccttgtga cctgacctcc ctggcagggg gaaggtgtgg 60 

cttatcgtgg ccggcttctg ctctccctgg agaccaagct ggtggagcac agtgaacaga 120 

aggtggagga ccttcctgcg gatgacatcc tccgggtgga ggtgaggggt gtggctctgg 180 
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gtgggagctg ggcgtcgggg cagggaaggg atggcca 217 

<210> 50 

<211> 269 

<212> DNA 

<213> Homo sapiens 

<400> 50 

agcctgggtg cctttctttg ctcctcccgt gaccctctgg tctactctct gctctcagaa 60 

gtaccttagg aggcgcaagt actccctgtt tgcggccttc tactcagcca ccatgctgca 120 

ggatgtggat gatgccatcc agtttgaggt cagcatcggg aactacggga acaagttcga 180 

catgacctgc ctgccgctgg cctccaccac tcagtacagc cgtgcagtct ttgacggtga 240 

ggcagtgctc ctggctggga ccccgatca 269 

<210> 51 

<211> 225 

<212> DNA 

<213> Homo sapiens 

<400> 51 

actcctggca cagcgctcag gcccgtctct ccattccagg gtgccactac tactacctac 60 

cctggggtaa cgtgaaacct gtggtggtgc tgtcatccta ctgggaggac atcagccata 120 

gaatcgagac tcagaaccag ctgcttggga ttgctgaccg gctggtgagt gaaaacttgc 180 

ccaaagctgc acatgcctat gcatgcacct gctacccccg ctgca 225 

<210> 52 

<211> 227 

<212> DNA 

<213> Homo sapiens 

<400> 52 

gggtccagca tgcaccctct gccctgtggt gacacacctg acccttgcct gcccattcca 60 

caggaagctg gcctggagca ggtccacctg gccctgaagg cgcagtgctc cacggaggac 120 

gtggactcgc tggtggctca gctgacggat gagctcatcg caggctgcag gtagggggga 180 

cctggcgccc ctggtgccca cctctcctgg ctcaactggg cctgttt 227 

<210> 53 

<211> 303 

<212> DNA 

<213> Homo sapiens 

<400> 53 

tgggagaccc tgggctcatc aggcgcattc catctgtccg tccctcacag ccagcctctg 60 

ggtgacatcc atgagacacc ctctgccacc cacctggacc agtacctgta ccagctgcgc 120 

acccatcacc tgagccaaat cactgaggct gccctggccc tgaagctcgg ccacagtgag 180 

ctccctgcag ctctggagca ggcggaggac tggctcctgc gtctgcgtgc cctggcagag 240 

gaggtaatta agcctggggg tgcctttctt cttctgctct cctgctgcct ggaacatcag 300 

aac 303 

<210> 54 

<211> 272 

<212> DNA 

<213> Homo sapiens 

<400> 54 

cgtgggcctg gtgtgtcacc atccccaccc cgaccaccac cctctgttca gccccagaac 60 

agcctgccgg acatcgtcat ctggatgctg cagggagaca agcgtgtggc ataccagcgg 120 

gtgcccgccc accaagtcct cttctcccgg cggggtgcca actactgtgg caagaattgt 180 

gggaagctac agacaatctt tctgaaagtg agttttcttt ttccaagtca tgatcgtatt 240 

tccaacataa ggcctttctc ccatctcttg ct 272 



<210> 55 

<211> 219 

<212> DNA 

<213> Homo sapiens 
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<400> 55 



tgtgggtttc tgtccttctt cggtacccag tatccgatgg agaaggtgcc tggcgcccgg 60 

atgccagtgc agatacgggt caagctgtgg tttgggctct ctgtggatga gaaggagttc 120 

aaccagtttg ctgaggggaa gctgtctgtc tttgctgaaa ccgtgagtac ctgccagccc 180 

ccacctctgc ctcccactac ctggagctgc cttggcccc 219 

<210> 56 

<211> 292 

<212> DNA 

<213> Homo sapiens 

<400> 56 

tgcctcccac tacctggagc tgccttggcc cccttcacgc ctcattcttc ctggccctcc 60 

agtatgagaa cgagactaag ttggcccttg ttgggaactg gggcacaacg ggcctcacct 120 

accccaagtt ttctgacgtc acgggcaaga tcaagctacc caaggacagc ttccgcccct 180 

cggccggctg gacctgggct ggagattggt tcgtgtgtcc ggagaagacg tgagtcgtgg 240 

gcagggaggg ctggggagag ccaggccagg ctgcccacca tggactgcac cc 292 

<210> 57 

<211> 242 

<212> DNA 

<213> Homo sapiens 

<400> 57 

tggatggggg cctctccagc agagcagcag agactctgac cagccctcct ccacagtctg 60 

ctccatgaca tggacgccgg tcacctgagc ttcgtggaag aggtgtttga gaaccagacc 120 

cggcttcccg gaggccagtg gatctacatg agtgacaact acaccgatgt ggtaaagcag 180 

gcactcaggg gcaggtgggg tctagacatt tggtctctgg aggcacctgg tgctcaggga 240 

ca 242 

<210> 58 

<211> 215 

<212> DNA 

<213> Homo sapiens 

<400> 58 

tcacatctgt ctgtctcctc tcattgcttg cctgttcggt tttgtcctta gaacggggag 60 

aaggtgcttc ccaaggatga cattgagtgc ccactgggct ggaagtggga agatgaggaa 120 

tggtccacag acctcaaccg ggctgtcgat gagcaaggtg ggcagcatgt ggaacctggc 180 

gagccccatc cccggcaagc tctcaagcca tgcat 215 

<210> 59 

<211> 246 

<212> DNA 

<213> Homo sapiens 

<400> 59 

agagatggtc ccaggagaga tggggggaag tgccaagcaa tgagtgaccg gttccccctc 60 

ccccaggctg ggagtatagc atcaccatcc ccccggagcg gaagccgaag cactgggtcc 120 

ctgctgagaa gatgtactac acacaccgac ggcggcgctg ggtgcgcctg cgcaggaggg 180 

atctcagcca aatggaagca ctgaaaaagg gtgagccagc aggtggtggg tgggagtgag 240 

gcctgt 246 

<210> 60 

<211> 253 

<212> DNA 

<213> Homo sapiens 

<400> 60 

cttcccaccg gcctctgagt ctgccccttc ttgtgcagca caggcaggcg gaggcggagg 60 

gcgagggctg ggagtacgcc tctctttttg gctggaagtt ccacctcgag taccgcaaga 120 

cagatgcctt ccgccgccgc cgctggcgcc gtcgcatgga gccactggag aagacggggc 180 

ctgcagctgt gtttgccctt gagggggccc tggtatgtgg ggctgcactt gtcctggctt 240 

gggtagggta tat 253 

<210> 61 
<211> 177 
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<212> DNA 

<213> Homo sapiens 

<400> 61 

gaatctgcca taaccagctt cgtgtctcca gggcggcgtg atggatgaca agagtgaaga 60 
ttccatgtcc gtctccacct tgagcttcgg tgtgaacaga cccacgattt cctgcatatt 120 
cgactgtaag taggcttcga ggcctctatg gggtgataag ggtgtgtcac cttatgc 177 

<210> 62 

<211> 181 

<212> DNA 

<213> Homo sapiens 

<400> 62 

aaccactcca gccactcact ctggcacctc tgttttttcc cttggtgaag atgggaaccg 60 
ctaccatcta cgctgctaca tgtaccaggc ccgggacctg gctgcgatgg acaaggactc 120 
tttttctggt aggtgggaga gaggcaggag agtcagagac tgtgggctga gatctgggaa 180 
t 181 



<210> 63 

<211> 319 

<212> DNA 

<213> Homo sapiens 



<400> 63 

ccccacatgg ctctggagaa gacatctctc agggtccctg ctgtgtaatg tctcccctcc 60 
ccctctggcc atgcagatcc ctatgccatc gtctccttcc tgcaccagag ccagaagacg 120 
gtggtggtga agaacaccct taaccccacc tgggaccaga cgctcatctt ctacgagatc 180 
gagatctttg gcgagccggc cacagttgct gagcaaccgc ccagcattgt ggtggagctg 240 
tacgaccatg acacttatgt gagtctgccc agctcctgcc tcgtcccctc acagggaggg 300 
accatgtgca aaggtgggg 319 

<210> 64 

<211> 249 

<212> DNA 

<213> Homo sapiens 

<400> 64 

gccctgggta agggatgctg attcttgtct ctctacgctt ggtctagggt gcagacgagt 60 
ttatgggtcg ctgcatctgt caaccgagtc tggaacggat gccacggctg gcctggttcc 120 
cactgacgag gggcagccag ccgtcggggg agctgctggc ctcttttgag ctcatccaga 180 
gagagaaggt gaggctggtc tatatccaga tccaggaggc ccaggcagga gtggggtggg 240 
ggccaaccc 249 

<210> 65 

<211> 158 

<212> DNA 

<213> Homo sapiens 

<400> 65 

cactgacata gtccatgagt gtcatgaggg tgatgggggc cttaggtgac aagcacatga 60 

ccagagctct cttttcttca ctccagccgg ccatccacca tattcctggt tttgaggtaa 120 

gtcttgctct gacctttcct tcttcaaact gattgcca 158 

<210> 66 

<211> 132 

<212> DNA 

<213> Homo sapiens 

<400> 66 

ctttttcccc ttccaacccc tctcaccatc tcctgatgtg cacatcccat ggctgtgggc 60 
caggtgcagg agacatcaag gatcctggat gaggtgagct ggcggggccg aggtagaggg 120 
aaggtgaagc ca 132 

<210> 67 
<211> 216 
<212> DNA 
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<213> Homo sapiens 



<400> 67 

tcttccttcc acctttgtct ccattctacc tgctgtccac tgcagtctga ggacacagac 60 
ctgccctacc caccacccca gagggaggcc aacatctaca tggttcctca gaacatcaag 120 
ccagcgctcc agcgtaccgc catcgaggtg agccgtccgg gcctgggcgt gggggctggg 180 
agcagcctgc ccttcccctt cctggcccca gccttt 216 

<210> 68 

<211> 263 

<212> DNA 

<213> Homo sapiens 

<400> 68 

cccgggcctt ctgagccact ctcctcattc tgtgtgctta gaatcctggc atggggcctg 60 
cggaacatga agagttacca gctggccaac atctcctccc ccagcctcgt ggtagagtgt 120 
gggggccaga cggtgcagtc ctgtgtcatc aggaacctcc ggaagaaccc caactttgac 180 
atctgcaccc tcttcatgga agtggtgagc cccacctccc tactgtcccc ttccagagtc 240 
ctggggctag aagttctaca tgt 263 

<210> 69 

<211> 249 

<212> DNA 

<213> Homo sapiens 

<400> 69 

caggccagtg cgttcttcct cctccaccca gatgctgccc agggaggagc tctactgccc 60 

ccccatcacc gtcaaggtca tcgataaccg ccagtttggc cgccggcctg tggtgggcca 120 

gtgtaccatc cgctccctgg agagcttcct gtgtgacccc tactcggcgg agagtccatc 180 

cccacagggt ggcccaggta ggggaagggg agatgatggg caggtcaggg aagggggagc 240 
ctagggcaa 249 

<210> 70 

<211> 180 

<212> DNA 

<213> Homo sapiens 

<400> 70 

aggggcgagc cttttgagag agcccctgtc aggcctggat ggctccctcc cctgcagacg 60 
atgtgagcct actcagtcct ggggaagacg tgctcatcga cattgatgac aaggagcccc 120 
tcatccccat ccaggtagga tgggcatcct ccagggaggc ctgggtcacc tttcccctcc 180 



<210> 71 

<211> 211 

<212> DNA 

<213> Homo sapiens 

<400> 71 

tgctgcttgg cgagtcctgt ttctgaaatg gtctctttct ttctacccac tcaggaggaa 60 
gagttcatcg attggtggag caaattcttt gcctccatag gggagaggga aaagtgcggc 120 
tcctacctgg agaaggattt tgacaccctg aaggtaaggc ctctcttcag tctgacagtc 180 
ggtgtgtgtg tgcgtgctgg gcagtgggag a 211 

<210> 72 
<211> 235 
<212> DNA 

<213> Homo sapiens 
<400> 72 

gttctacttt ctttctgtct cttgtcccct cctctaatcc ccatgtgtgg caggtctatg 60 
acacacagct ggagaatgtg gaggcctttg agggcctgtc tgacttttgt aacaccttca 120 
agctgtaccg gggcaagacg caggaggaga cagaagatcc atctgtgatt ggtgaattta 180 
aggtaaatcc tcgaagacgt ccctaaccca ggtgggccta agactgtggt gttgg 235 

<210> 73 
<211> 268 
<212> DNA 
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<213> Homo sapiens 
<400> 73 

ggggacacag ccaaaccata tcaacaatga tgataaaata aaattaaccc ttccttcttt 60 
tcagggcctc ttcaaaattt atcccctccc agaagaccca gccatcccca tgcccccaag 120 
acagttccac cagctggccg cccagggacc ccaggagtgc ttggtccgta tctacattgt 180 
ccgagcattt ggcctgcagc ccaaggaccc caatggaaag gtaactttct agagccctca 240 
cctccccaga gtagcaggct caggtaca 268 

<210> 74 

<211> 200 

<212> DNA 

<213> Homo sapiens 

<400> 74 

tttggaaagt gttttcacag aagtgttttg tctcctcctc cagtgtgatc cttacatcaa 60 
gatctccata gggaagaaat cagtgagtga ccaggataac tacatcccct gcacgctgga 120 
gcccgtattt ggaaagtaaa ttggggcatc ttgggtcttg gggtggagga gccagacagg 180 
ataacccaca gtctagtggg 200 



<210> 75 

<211> 263 

<212> DNA 

<213> Homo sapiens 

<400> 75 

cctgttccct tgggtgccct gtgttggctg acattcggga atctgcccct tcctgcagga 60 
tgttcgagct gacctgcact ctgcctctgg agaaggacct aaagatcact ctctatgact 120 
atgacctcct ctccaaggac gaaaagatcg gtgagacggt cgtcgacctg gagaacaggc 180 
tgctgtccaa gtttggggct cgctgtggac tcccacagac ctactgtgtg tacgtggatg 240 
ggggctggct gcctgcttct ctg 263 

<210> 76 

<211> 237 

<212> DNA 

<213> Homo sapiens 

<400> 76 

aagcatctcg tctatgtctt gtgcttgctc ctcagctctg gaccgaacca gtggcgggac 60 

cagctccgcc cctcccagct cctccacctc ttctgccagc agcatagagt caaggcacct 120 

gtgtaccgga cagaccgtgt aatgtttcag gataaagaat attccattga agagataggt 180 

gagctgccac atgaccccaa accatggtgg gctctcgctg tatccctccc tctctca 237 

<210> 77 

<211> 245 

<212> DNA 

<213> Homo sapiens 

<400> 77 

tctctcgctt ccccagctcc tgcaactttt ttgtgttctc tctggggcag aggctggcag 60 
gatcccaaac ccacacctgg gcccagtgga ggagcgtctg gctctgcatg tgcttcagca 120 
gcagggcctg gtcccggagc acgtggagtc acggcccctc tacagccccc tgcagccaga 180 
catcgagcag gtaggacctt acccttggtc ccagagtcct cgaactccag aagcccaacc 240 
ccagg 245 

<210> 78 

<211> 214 

<212> DNA 

<213> Homo sapiens 

<400> 78 

ggtgcttggt aacagctggt taaatgagaa gggtggggag agaacggacc tgtctccgca 60 
ggggaagctg gggaagctgc agatgtgggt cgacctattt ccgaaggccc tggggcggcc 120 
tggacctccc ttcaacatca ccccacggag agccagaagg tgacttccca gccacaggct 180 
ctgagctggg ctgaggggtg gggcgttgca gcct 214 
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<210> 79 

<211> 229 

<212> DNA 

<213> Homo sapiens 

<400> 79 

ttcttaaggc cttcccatcc tttggtagga aatctaggtg gattagagtg atacctttcc 60 

ccaggttttt cctgcgttgt attatctgga ataccagaga tgtgatcctg gatgacctga 120 

gcctcacggg ggagaagatg agcgacattt atgtgaaagg gtagggagcc agcgtcctct 180 

tgcctgtcca gcttcccgca gctcccgtgc tccctctggg ttgtgcaca 229 

<210> 80 

<211> 261 

<212> DNA 

<213> Homo sapiens 

<400> 80 

acgatgtata tactgtgttg gaaatcttaa tgagaactat tctctaaaaa catgtatgtc 60 

tagttggatg attggctttg aagaacacaa gcaaaagaca gacgtgcatt atcgttccct 120 

gggaggtgaa ggcaacttca actggaggtt cattttcccc ttcgactacc tgccagctga 180 

gcaagtctgt accattgcca agaaggtcag tgtccttccg attccctgtg gtgccagcac 240 

cagggcttct aaagttagcc t 261 

<210> 81 

<211> 234 

<212> DNA 

<213> Homo sapiens 

<400> 81 

tgcctctctc taactttgct tccttgcatc cttctctgtt cctcttccgg gtcaggatgc 60 

cttctggagg ctggacaaga ctgagagcaa aatcccagca cgagtggtgt tccagatctg 120 

ggacaatgac aagttctcct ttgatgattt tctggtgatt ttctgggtaa gcgctattgc 180 

tagaatccca ttctgcacat gggggctgcc ccagaaccca cactgtgtgt ttat 234 

<210> 82 

<211> 297 

<212> DNA 

<213> Homo sapiens 

<400> 82 

ggctacaggc tggcagtgat cgagaaaccc ggccaaaaac cacctctctg ttgcaggctc 60 

cctgcagctc gatctcaacc gcatgcccaa gccagccaag acagccaaga agtgctcctt 120 

ggaccagctg gatgatgctt tccacccaga atggtttgtg tccctttttg agcagaaaac 180 

agtgaagggc tggtggccct gtgtagcaga agagggtgag aagaaaatac tggcggtaag 240 

tctacttcct ccagccccag tggagggcat gggggaagct tcttccatag aaattgt 297 

<210> 83 

<211> 237 

<212> DNA 

<213> Homo sapiens 

<400> 83 

cctggttact ctccaggcca ctgagcagag ccttcgtgcc cctaaccaag tgctctctgt 60 

cccctcaggg caagctggaa atgaccttgg agattgtagc agagagtgag catgaggagc 120 

ggcctgctgg ccagggccgg gatgagccca acatgaaccc taagcttgag gacccaaggt 180 

cagtgcccag cccctgagcc ccaatgccca caggtctggg ggtataggca cagtcca 237 

<210> 84 

<211> 252 

<212> DNA 

<213> Homo sapiens 

<400> 84 

ccctagtaaa ggatgcccag ttgactccgg gatctcgctt ccaggcgccc cgacacctcc 60 

ttcctgtggt ttacctcccc atacaagacc atgaagttca tcctgtggcg gcgtttccgg 120 

tgggccatca tcctcttcat catcctcttc atcctgctgc tgttcctggc catcttcatc 180 

tacgccttcc cggtgagcag gcctgacgac actgtggtgg gggaactctg ggtctaatgg 240 
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gggagttcat ca 252 



<210> 85 

<211> 391 

<212> DNA 

<213> Homo sapiens 

<400> 85 

tggctgtgcc tgccccagtg ggatcaccat gggtccctgt ctcctccctc cctccagaac 60 

tatgctgcca tgaagctggt gaagcccttc agctgaggac tctcctgccc tgtagaaggg 120 

gccgtggggt cccctccagc atgggactgg cctgcctcct ccgcccagct cggcgagctc 180 

ctccagacct cctaggcctg attgtcctgc cagggtgggc agacagacag atggaccggc 240 

ccacactccc agagttgcta acatggagct ctgagatcac cccacttcca tcatttcctt 300 

ctcccccaac ccaacgcttt tttggatcag ctcagacata tttcagtata aaacagttgg 360 
aaccacaaaa aaaaaaaaaa aaaaaaaaaa a 391 

<210> 86 

<211> 51 

<212> PRT 

<213> Homo sapiens 

<400> 86 

Lys Lys Arg Thr Lys Val Ile Lys Asn Ser Val Asn Pro Val Trp Asn 

1 5 10 15 

Glu Gly Phe Glu Trp Asp Leu Lys Gly Ile Pro Leu Asp Gln Gly Ser 

20 25 30 

Glu Leu His Val Val Val Lys Asp His Glu Thr Met Gly Arg Asn Arg 
35 40 45 

Phe Leu Gly 
50 

<210> 87 

<211> 45 

<212> PRT 

<213> Homo sapiens 

<400> 87 

Ser Lys Ile Leu Glu Lys Thr Ala Asn Pro Gln Trp Asn Gln Asn Ile 

1 5 10 15 

Thr Leu Pro Ala Met Phe Pro Ser Met Cys Glu Lys Met Arg Ile Arg 

20 25 30 

Ile Ile Asp Trp Asp Arg Leu Thr His Asn Asp Ile Val 
35 40 45 

<210> 88 

<211> 82 

<212> PRT 

<213> Homo sapiens 

<400> 88 
Gln Ala Arg Asp Leu Ala Ala Met Asp Lys Asp Ser Phe Ser Asp Pro 

1 5 10 15 

Tyr Ala Ile Val Ser Phe Leu His Gln Ser Gln Lys Thr Val Val Val 

20 25 30 

Lys Asn Thr Leu Asn Pro Thr Trp Asp Gln Thr Leu Ile Phe Tyr Glu 

35 40 45 

Ile Glu Ile Phe Gly Glu Pro Ala Thr Val Ala Glu Gln Pro Pro Ser 

50 55 60 

Ile Val Val Glu Leu Tyr Asp His Asp Thr Tyr Gly Ala Asp Glu Phe 
65 70 75 80 

Met Gly 

<210> 89 

<211> 79 

<212> PRT 

<213> Homo sapiens 
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<400> 89 

Ile Tyr Ile Val Arg Ala Phe Gly Leu Gln Pro Lys Asp Pro Asn Gly 

1 5 10 15 

Lys Cys Asp Pro Tyr Ile Lys Ile Ser Ile Gly Lys Lys Ser Val Ser 

20 25 30 

Asp Gln Asp Asn Tyr Ile Pro Cys Thr Leu Glu Pro Val Phe Gly Lys 

35 40 45 

Met Phe Glu Leu Thr Cys Thr Leu Pro Leu Glu Lys Asp Leu Lys Ile 

50 55 60 

Thr Leu Tyr Asp Tyr Asp Leu Leu Ser Lys Asp Glu Lys Ile Gly 
65 70 75 

<210> 90 

<211> 152 

<212> DNA 

<213> Homo sapiens 

<400> 90 

acgatgtata tactgtgttg gaaatcttaa tgagaactat tctctaaaaa catgtatgtc 60 

tagttggatg attggctttg aagaacacaa gcaaaagaca gacgtgcatt atcgttccct 120 

gggaggtgaa ggcaacttca actggaggtt ca 152 

<210> 91 

<211> 56 

<212> DNA 

<213> Homo sapiens 

<400> 91 

gtcagtgtcc ttccgattcc ctgtggtgcc agcaccaggg cttctaaagt tagcct 56 

<210> 92 

<211> 55 

<212> DNA 

<213> Homo sapiens 

<400> 92 

tgcctctctc taactttgct tccttgcatc cttctctgtt cctcttccgg gtcag 55 

<210> 93 

<211> 68 

<212> DNA 

<213> Homo sapiens 



<400> 93 

gtaagcgcta ttgctagaat cccattctgc acatgggggc tgccccagaa cccacactgt 60 
gtgtttat 68 

<210> 94 

<211> 56 

<212> DNA 

<213> Homo sapiens 

<400> 94 

ggctacaggc tggcagtgat cgagaaaccc ggccaaaaac cacctctctg ttgcag 56 

<210> 95 

<211> 62 

<212> DNA 

<213> Homo sapiens 

<400> 95 

gtaagtctac ttcctccagc cccagtggag ggcatggggg aagcttcttc catagaaatt 60 

gt 62 

<210> 96 
<211> 68 
<212> DNA 
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<213> Homo sapiens 

<400> 96 
cctggttact ctccaggcca ctgagcagag ccttcgtgcc cctaaccaag tgctctctgt 60 

cccctcag 68 

<210> 97 

<211> 59 

<212> DNA 

<213> Homo sapiens 

<400> 97 
gtcagtgccc agcccctgag ccccaatgcc cacaggtctg ggggtatagg cacagtcca 59 

<210> 98 

<211> 44 

<212> DNA 

<213> Homo sapiens 

<400> 98 
ccctagtaaa ggatgcccag ttgactccgg gatctcgctt ccag 44 

<210> 99 

<211> 60 

<212> DNA 

<213> Homo sapiens 

<400> 99 

gtgagcaggc ctgacgacac tgtggtgggg gaactctggg tctaatgggg gagttcatca 60 

<210> 100 

<211> 57 

<212> DNA 

<213> Homo sapiens 

<400> 100 

tggctgtgcc tgccccagtg ggatcaccat gggtccctgt ctcctccctc cctccag 57 

<210> 101 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 101 
tctcttctcc tagagggcca tag 

<210> 102 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 102 
ctgttcctcc ccatcgtctc atgg 

<210> 103 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 103 
gctcctcccg tgaccctctg 

<210> 104 

<211> 21 

<212> DNA 

<213> Homo sapiens 
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<400> 104 
gggtcccagc caggagcact g 

<210> 105 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 105 
cccctctcac catctcctga tgtg 24 

<210> 106 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 106 
tggcttcacc ttccctctac ctcgg 25 

<210> 107 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 107 
tcctttggta ggaaatctag gtgg 24 

<210> 108 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 108 
ggaagctgga caggcaagag g 21 

<210> 109 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 109 
atatactgtg ttggaaatct taatgag 27 

<210> 110 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 110 
gctggcacca cagggaatcg g 21 

<210> 111 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 111 
ctttgcttcc ttgcatcctt ctctg 25 

<210> 112 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 112 
agcccccatg tgcagaatgg g 21 
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
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<210> 113 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 113 
ggcagtgatc gagaaacccg g 21 

<210> 114 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 114 
catgccctcc actggggctg g 21 

<210> 115 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 115 
ggatgcccag ttgactccgg g 21 

<210> 116 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 116 
ccccaccaca gtgtcgtcag g 21 

<210> 117 

<211> 6240 

<212> DNA 

<213> Homo sapiens 

<400> 117 

atgctgaggg tcttcatcct ctatgccgag aacgtccaca cacccgacac cgacatcagc 60 

gatgcctact gctccgcggt gtttgcaggg gtgaagaaga gaaccaaagt catcaagaac 120 

agcgtgaacc ctgtatggaa tgagggattt gaatgggacc tcaagggcat ccccctggac 180 

cagggctctg agcttcatgt ggtggtcaaa gaccatgaga cgatggggag gaacaggttc 240 

ctgggggaag ccaaggtccc actccgagag gtcctcgcca cccctagtct gtccgccagc 300 

ttcaatgccc ccctgctgga caccaagaag cagcccacag gggcctcgct ggtcctgcag 360 

gtgtcctaca caccgctgcc tggagctgtg cccctgttcc cgccccctac tcctctggag 420 

ccctccccga ctctgcctga cctggatgta gtggcagaca caggaggaga ggaagacaca 480 

gaggaccagg gactcactgg agatgaggcg gagccattcc tggatcaaag cggaggcccg 540 

ggggctccca ccaccccaag gaaactacct tcacgtcctc cgccccacta ccccgggatc 600 

aaaagaaagc gaagtgcgcc tacatctaga aagctgctgt cagacaaacc gcaggatttc 660 

cagatcaggg tccaggtgat cgaggggcgc cagctgccgg gggtgaacat caagcctgtg 720 

gtcaaggtta ccgctgcagg gcagaccaag cggacgcgga tccacaaggg aaacagccca 780 

ctcttcaatg agactctttt cttcaacttg tttgactctc ctggggagct gtttgatgag 840 

cccatcttta tcacggtggt agactctcgt tctctcagga cagatgctct cctcggggag 900 

ttccggatgg acgtgggcac catttacaga gagccccggc acgcctatct caggaagtgg 960 

ctgctgctct cagaccctga tgacttctct gctggggcca gaggctacct gaaaacaagc 1020 

ctttgtgtgc tggggcctgg ggacgaagcg cctctggaga gaaaagaccc ctctgaagac 1080 

aaggaggaca ttgaaagcaa cctgctccgg cccacaggcg tagccctgcg aggagcccac 1140 

ttctgcctga aggtcttccg ggccgaggac ttgccgcaga tggacgatgc cgtgatggac 1200 

aacgtgaaac agatctttgg cttcgagagt aacaagaaga acttggtgga cccctttgtg 1260 

gaggtcagct ttgcggggaa aatgctgtgc agcaagatct tggagaagac ggccaaccct 1320 

cagtggaacc agaacatcac actgcctgcc atgtttccct ccatgtgcga aaaaatgagg 1380 

attcgtatca tagactggga ccgcctgact cacaatgaca tcgtggctac cacctacctg 1440 

agtatgtcga aaatctctgc ccctggagga gaaatagaag aggagcctgc aggtgctgtc 1500 

aagccttcga aagcctcaga cttggatgac tacctgggct tcctccccac ttttgggccc 1560 

tgctacatca acctctatgg cagtcccaga gagttcacag gcttcccaga cccctacaca 1620 

gagctcaaca caggcaaggg ggaaggtgtg gcttatcgtg gccggcttct gctctccctg 1680 

gagaccaagc tggtggagca cagtgaacag aaggtggagg accttcctgc ggatgacatc 1740 

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ctccgggtgg agaagtacct taggaggcgc aagtactccc tgtttgcggc cttctactca 1800 

gccaccatgc tgcaggatgt ggatgatgcc atccagtttg aggtcagcat cgggaactac 1860 

gggaacaagt tcgacatgac ctgcctgccg ctggcctcca ccactcagta cagccgtgca 1920 

gtctttgacg ggtgccacta ctactaccta ccctggggta acgtgaaacc tgtggtggtg 1980 

ctgtcatcct actgggagga catcagccat agaatcgaga ctcagaacca gctgcttggg 2040 

attgctgacc ggctggaagc tggcctggag caggtccacc tggccctgaa ggcgcagtgc 2100 

tccacggagg acgtggactc gctggtggct cagctgacgg atgagctcat cgcaggctgc 2160 

agccagcctc tgggtgacat ccatgagaca ccctctgcca cccacctgga ccagtacctg 2220 

taccagctgc gcacccatca cctgagccaa atcactgagg ctgccctggc cctgaagctc 2280 

ggccacagtg agctccctgc agctctggag caggcggagg actggctcct gcgtctgcgt 2340 

gccctggcag aggagcccca gaacagcctg ccggacatcg tcatctggat gctgcaggga 2400 

gacaagcgtg tggcatacca gcgggtgccc gcccaccaag tcctcttctc ccggcggggt 2460 

gccaactact gtggcaagaa ttgtgggaag ctacagacaa tctttctgaa atatccgatg 2520 

gagaaggtgc ctggcgcccg gatgccagtg cagatacggg tcaagctgtg gtttgggctc 2580 

tctgtggatg agaaggagtt caaccagttt gctgagggga agctgtctgt ctttgctgaa 2640 

acctatgaga acgagactaa gttggccctt gttgggaact ggggcacaac gggcctcacc 2700 

taccccaagt tttctgacgt cacgggcaag atcaagctac ccaaggacag cttccgcccc 2760 

tcggccggct ggacctgggc tggagattgg ttcgtgtgtc cggagaagac tctgctccat 2820 

gacatggacg ccggtcacct gagcttcgtg gaagaggtgt ttgagaacca gacccggctt 2880 

cccggaggcc agtggatcta catgagtgac aactacaccg atgtgaacgg ggagaaggtg 2940 

cttcccaagg atgacattga gtgcccactg ggctggaagt gggaagatga ggaatggtcc 3000 

acagacctca accgggctgt cgatgagcaa ggctgggagt atagcatcac catccccccg 3060 

gagcggaagc cgaagcactg ggtccctgct gagaagatgt actacacaca ccgacggcgg 3120 

cgctgggtgc gcctgcgcag gagggatctc agccaaatgg aagcactgaa aaggcacagg 3180 

caggcggagg cggagggcga gggctgggag tacgcctctc tttttggctg gaagttccac 3240 

ctcgagtacc gcaagacaga tgccttccgc cgccgccgct ggcgccgtcg catggagcca 3300 

ctggagaaga cggggcctgc agctgtgttt gcccttgagg gggccctggg cggcgtgatg 3360 

gatgacaaga gtgaagattc catgtccgtc tccaccttga gcttcggtgt gaacagaccc 3420 

acgatttcct gcatattcga ctatgggaac cgctaccatc tacgctgcta catgtaccag 3480 

gcccgggacc tggctgcgat ggacaaggac tctttttctg atccctatgc catcgtctcc 3540 

ttcctgcacc agagccagaa gacggtggtg gtgaagaaca cccttaaccc cacctgggac 3600 

cagacgctca tcttctacga gatcgagatc tttggcgagc cggccacagt tgctgagcaa 3660 

ccgcccagca ttgtggtgga gctgtacgac catgacactt atggtgcaga cgagtttatg 3720 

ggtcgctgca tctgtcaacc gagtctggaa cggatgccac ggctggcctg gttcccactg 3780 

acgaggggca gccagccgtc gggggagctg ctggcctctt ttgagctcat ccagagagag 3840 

aagccggcca tccaccatat tcctggtttt gaggtgcagg agacatcaag gatcctggat 3900 

gagtctgagg acacagacct gccctaccca ccaccccaga gggaggccaa catctacatg 3960 

gttcctcaga acatcaagcc agcgctccag cgtaccgcca tcgagatcct ggcatggggc 4020 

ctgcggaaca tgaagagtta ccagctggcc aacatctcct cccccagcct cgtggtagag 4080 

tgtgggggcc agacggtgca gtcctgtgtc atcaggaacc tccggaagaa ccccaacttt 4140 

gacatctgca ccctcttcat ggaagtgatg ctgcccaggg aggagctcta ctgccccccc 4200 

atcaccgtca aggtcatcga taaccgccag tttggccgcc ggcctgtggt gggccagtgt 4260 

accatccgct ccctggagag cttcctgtgt gacccctact cggcggagag tccatcccca 4320 

cagggtggcc cagacgatgt gagcctactc agtcctgggg aagacgtgct catcgacatt 4380 

gatgacaagg agcccctcat ccccatccag gaggaagagt tcatcgattg gtggagcaaa 4440 

ttctttgcct ccatagggga gagggaaaag tgcggctcct acctggagaa ggattttgac 4500 

accctgaagg tctatgacac acagctggag aatgtggagg cctttgaggg cctgtctgac 4560 

ttttgtaaca ccttcaagct gtaccggggc aagacgcagg aggagacaga agatccatct 4620 

gtgattggtg aatttaaggg cctcttcaaa atttatcccc tcccagaaga cccagccatc 4680 

cccatgcccc caagacagtt ccaccagctg gccgcccagg gaccccagga gtgcttggtc 4740 

cgtatctaca ttgtccgagc atttggcctg cagcccaagg accccaatgg aaagtgtgat 4800 

ccttacatca agatctccat agggaagaaa tcagtgagtg accaggataa ctacatcccc 4860 

tgcacgctgg agcccgtatt tggaaagatg ttcgagctga cctgcactct gcctctggag 4920 

aaggacctaa agatcactct ctatgactat gacctcctct ccaaggacga aaagatcggt 4980 

gagacggtcg tcgacctgga gaacaggctg ctgtccaagt ttggggctcg ctgtggactc 5040 

ccacagacct actgtgtctc tggaccgaac cagtggcggg accagctccg cccctcccag 5100 

ctcctccacc tcttctgcca gcagcataga gtcaaggcac ctgtgtaccg gacagaccgt 5160 

gtaatgtttc aggataaaga atattccatt gaagagatag aggctggcag gatcccaaac 5220 

ccacacctgg gcccagtgga ggagcgtctg gctctgcatg tgcttcagca gcagggcctg 5280 

gtcccggagc acgtggagtc acggcccctc tacagccccc tgcagccaga catcgagcag 5340 

gggaagctgc agatgtgggt cgacctattt ccgaaggccc tggggcggcc tggacctccc 5400 

ttcaacatca ccccacggag agccagaagg tttttcctgc gttgtattat ctggaatacc 5460 

agagatgtga tcctggatga cctgagcctc acgggggaga agatgagcga catttatgtg 5520 

aaaggttgga tgattggctt tgaagaacac aagcaaaaga cagacgtgca ttatcgttcc 5580 

ctgggaggtg aaggcaactt caactggagg ttcattttcc ccttcgacta cctgccagct 5640 

gagcaagtct gtaccattgc caagaaggat gccttctgga ggctggacaa gactgagagc 5700 

aaaatcccag cacgagtggt gttccagatc tgggacaatg acaagttctc ctttgatgat 5760 
tttctgggct ccctgcagct cgatctcaac cgcatgccca agccagccaa gacagccaag 5820 

aagtgctcct tggaccagct ggatgatgct ttccacccag aatggtttgt gtcccttttt 5880 

gagcagaaaa cagtgaaggg ctggtggccc tgtgtagcag aagagggtga gaagaaaata 5940 

ctggcgggca agctggaaat gaccttggag attgtagcag agagtgagca tgaggagcgg 6000 

cctgctggcc agggccggga tgagcccaac atgaacccta agcttgagga cccaaggcgc 6060 

cccgacacct ccttcctgtg gtttacctcc ccatacaaga ccatgaagtt catcctgtgg 6120 

cggcgtttcc ggtgggccat catcctcttc atcatcctct tcatcctgct gctgttcctg 6180 

gccatcttca tctacgcctt cccgaactat gctgccatga agctggtgaa gcccttcagc 6240 

<210> 118 

<211> 13 

<212> DNA 

<213> Homo sapiens 

<400> 118 
cgcaagcatg ctg 13 

<210> 119 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 119 
gagacgatgg gg 12 

<210> 120 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 120 
gatctaaccc tgctgctcac c 21 

<210> 121 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 121 
ctggtgtgtt gcagagcgct g 21 

<210> 122 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 122 
cctctcttct gctgtcttca g 21 

<210> 123 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 123 
tgtgtctggt tcaccttcgt g 21 

<210> 124 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 124 
tccaaataga aatgcctgaa c 

<210> 125 
<211> 21 
<212> DNA 

<213> Homo sapiens 

<400> 125 
aggtatcacc tccaagtgtt g 

<210> 126 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 126 
taccagcttc agagctccct g 

<210> 127 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 127 
ttgatcaggg tgctcttgg 

<210> 128 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 128 
ggagaattgc ttgaacccag 

<210> 129 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 129 
tggctaatga tgttgaacat tt 

<210> 130 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 130 
gacccacaag cggcgcctcg g 

<210> 131 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 131 
gaccccggcg agggtggtcg g 

<210> 132 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 132 
tgtctctcca ttctcccttt tgtg 

<210> 133 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 133 

aggacactgc tgagaaggca cctc 24 

<210> 134 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 134 
agtgccctgg tggcacgaag g 

<210> 135 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 135 
cctacctgca ccttcaagcc atgg 

<210> 136 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 136 
cagaagagcc agggtgcctt agg 

<210> 137 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 137 
ccttggacct taacctggca gagg 

<210> 138 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 138 
cgaggccagc gcaccaacct g 

<210> 139 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 139 

actgccggcc attcttgctg gg 22 

<210> 140 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 140 

ccaggcctca ttagggccct c 21 

<210> 141 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 141 

ctgaagagga gcctggggtc ag 22 
<210> 142 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 142 
ctgagacttc tgactcttgg ggtg 

<210> 143 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 143 
aaggttctgc cctcatgccc catg 

<210> 144 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 144 
ctggcctgag ggatcagcag g 

<210> 145 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 145 
gtgcatacat acagcccacg gag 

<210> 146 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 146 
gagctattgg gttggccgtg tggg 

<210> 147 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 147 
accaacacgg agaagtgaga actg 

<210> 148 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 148 
ccacacttta tttaacgctt tggcgg 

<210> 149 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 149 
cagaaccaaa atgcaaggat acgg 

<210> 150 
<211> 25 
<212> DNA 
<213> Homo sapiens 

<400> 150 
cttctgattc tgggatcacc aaagg 

<210> 151 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 151 
ggaccgtaag gaagacccag gg 22 

<210> 152 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 152 
cctgtgctca ggagcgcatg aagg 24 

<210> 153 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 153 
gcagacctcc cacccaaggg cg 22 

<210> 154 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 154 
gagacagatg ggggacagtc aggg 24 

<210> 155 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 155 
cctcccgaga gaaccctcct g 

<210> 156 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 156 
gggagcccag agtccccatg g 21 

<210> 157 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 157 
gggcctcctt gggtttgctg g 



<210> 158 

<211> 21 

<212> DNA 

<213> Homo sapiens 
<400> 158 
gcctccccag catcctgccg g 

<210> 159 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 159 
tcactgagcc gaatgaaact gagg 

<210> 160 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 160 
tgtggcctga gttcctttcc tgtg 

<210> 161 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 161 
ggtcaaaggg cagaacgaag aggg 

<210> 162 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 162 
cccgtccttc tcccagccat g 

<210> 163 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 163 
ctcccctggt tgtccccaag g 

<210> 164 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 164 
cgacccctct gattgccact tgtg 

<210> 165 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 165 
ggcatcctgc ccttgccagg g 

<210> 166 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 166 
tctgtctccc ctgctccttg 



<210> 167 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 167 
cttccctgcc ccgacgccca g 

<210> 168 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 168 
cagcgctcag gcccgtctct c 21 

<210> 169 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 169 
tgcataggca tgtgcagctt tggg 

<210> 170 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 170 
catgcaccct ctgccctgtg g 

<210> 171 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 171 
agttgagcca ggagaggtgg g 21 

<210> 172 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 172 
catcaggcgc attccatctg tccg 

<210> 173 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 173 
agcaggagag cagaagaaga aagg 

<210> 174 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 174 

gtgtgtcacc atccccaccc cg 22 

<210> 175 
<211> 25 
<212> DNA 
<213> Homo sapiens 

<400> 175 
caagagatgg gagaaaggcc ttatg 

<210> 176 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 176 
ctgggacatc cggatcctga agg 

<210> 177 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 177 
tccaggtagt gggaggcaga gg 

<210> 178 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 178 
tcccactacc tggagctgcc ttgg 

<210> 179 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 179 
ggctctcccc agccctccct g 

<210> 180 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 180 
cagagcagca gagactctga ccag 

<210> 181 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 181 
tagaccccac ctgcccctga g 

<210> 182 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 182 
tcctctcatt gcttgcctgt tcgg 






<210> 183 

<211> 21 

<212> DNA 

<213> Homo sapiens 
<400> 183 

ttgagagctt gccggggatg g 21 

<210> 184 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 184 

aagtgccaag caatgagtga ccgg 24 

<210> 185 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 185 

ctcactccca cccaccacct g 21 

<210> 186 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 186 

cccaccggcc tctgagtctg c 21 

<210> 187 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 187 

accctaccca agccaggaca agtg 24 

<210> 188 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 188 

gaatctgcca taaccagctt cgtg 24 

<210> 189 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 189 

tatcacccca tagaggcctc gaag 24 

<210> 190 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 190 

cagccactca ctctggcacc tctg 24 

<210> 191 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 191 

agcccacagt ctctgactct cctg 24 
<210> 192 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 192 
acatctctca gggtccctgc tgtg 

<210> 193 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 193 
cctgtgaggg gacgaggcag g 

<210> 194 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 194 
gccctgggta agggatgctg attc 

<210> 195 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 195 
cctgcctggg cctcctggat c 

<210> 196 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 196 
gagggtgatg ggggccttag g 

<210> 197 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 197 
gcaatcagtt tgaagaagga aagg 

<210> 198 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 198 
cacctttgtc tccattctac ctgc 

<210> 199 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 199 
ctcccagccc ccacgcccag g 

<210> 200 
<211> 24 
<212> DNA 



<213> Homo sapiens 
<400> 200 

ctgagccact ctcctcattc tgtg 24 

<210> 201 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 201 

tggaagggga cagtagggag g 21 

<210> 202 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 202 

ggccagtgcg ttcttcctcc tc 22 

<210> 203 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 203 

tccctgacct gcccatcatc tc 22 

<210> 204 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 204 

gcccctgtca ggcctggatg g 21 

<210> 205 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 205 

tgacccaggc ctccctggag g 21 

<210> 206 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 206 

ctgaaatggt ctctttcttt ctac 24 

<210> 207 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 207 

cacaccgact gtcagactga agag 24 

<210> 208 

<211> 24 

<212> DNA 

<213> Homo sapiens 
<400> 208 
ttgtcccctc ctctaatccc catg 

<210> 209 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 209 
gggttaggga cgtcttcgag g 

<210> 210 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 210 
cagccaaacc atatcaacaa tg 

<210> 211 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 211 
ctggggaggt gagggctcta g 

<210> 212 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 212 
gaagtgtttt gtctcctcct c 

<210> 213 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 213 
gcaggcagcc agcccccatc 

<210> 214 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 214 
gggtgccctg tgttggctga c 

<210> 215 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 215 
gcaggcagcc agcccccatc 

<210> 216 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 216 
ctcgtctatg tcttgtgctt gctc 

<210> 217 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 217 

caccatggtt tggggtcatg tgg 23 

<210> 218 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 218 

tctcgcttcc ccagctcctg c 21 

<210> 219 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 219 

tctggagttc gaggactctg gg 22 

<210> 220 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 220 

agaagggtgg ggagagaacg g 21 

<210> 221 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 221 

cagctcagag cctgtggctg g 21 

<210> 222 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 222 

aaggccttcc catcctttgg tagg 24 

<210> 223 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 223 

acaacccaga gggagcacgg g 21 

<210> 224 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 224 

gttgacgatg tatatactgt gttgg 25 

<210> 225 
<211> 25 
<212> DNA 
<213> Homo sapiens 
<400> 225 

gcctctctct aactttgctt ccttg 25 

<210> 226 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 226 

ggctacaggc tggcagtgat cgag 24 

<210> 227 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 227 

ttcccccatg ccctccactg g 21 

<210> 228 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 228 

agccttcgtg cccctaacca agtg 24 

<210> 229 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 229 

ctgtgggcat tggggctcag g 21 

<210> 230 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 230 

gccccagtgg gatcaccatg 20 

<210> 231 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 231 

atgctggagg ggaccccacg g 21 

<210> 232 

<211> 3671 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (418)...(3381) 
<400> 232 

tcctggttca agcgattctc tggcctcagc ctcccgagta gctgggatta caggcatgct 60 
ccaccaagcc cgggtaattt tgtattttta atagagacgg ggttttgcca tgttggtcag 120 
gctggtctcg aactcctgac ctcaggtgat ctgcccacct tggcctccca acgtgctgag 180 
attacaggca tgagtcactg tgcccggcag agatggtcta attcatatga aagaactctg 240 




aaaaaagtag aaagtgattt tctaaaataa ggtacaaata attaatgtaa gcataatcac 300 
ctaaccttgt ggaatttttt ttttttgaga agcaaattgc aaatttgtga tagatctaaa 360 
ggagattgac taagagggtg accatctgga aatgacgtca tgtgagaatg gttaaag atg 420 

Met 
1 

ctc ggg aga ttg agc cta gag aaa gga aga ttt gtg aac cca gga ggc 468 
Leu Gly Arg Leu Ser Leu Glu Lys Gly Arg Phe Val Asn Pro Gly Gly 
5 10 15 

aga ggt aga gat cca gga gag ggc ggc gtg atg gat gac aag agt gaa 516 
Arg Gly Arg Asp Pro Gly Glu Gly Gly Val Met Asp Asp Lys Ser Glu 
20 25 30 

gat tcc atg tcc gtc tcc acc ttg agc ttc ggt gtg aac aga ccc acg 564 
Asp Ser Met Ser Val Ser Thr Leu Ser Phe Gly Val Asn Arg Pro Thr 
35 40 45 

att tcc tgc ata ttc gac tat ggg aac cgc tac cat cta cgc tgc tac 612 
Ile Ser Cys Ile Phe Asp Tyr Gly Asn Arg Tyr His Leu Arg Cys Tyr 
50 55 60 65 

atg tac cag gcc cgg gac ctg gct gcg atg gac aag gac tct ttt tct 660 
Met Tyr Gln Ala Arg Asp Leu Ala Ala Met Asp Lys Asp Ser Phe Ser 
70 75 80 

gat ccc tat gcc atc gtc tcc ttc ctg cac cag agc cag aag acg gtg 708 
Asp Pro Tyr Ala Ile Val Ser Phe Leu His Gln Ser Gln Lys Thr Val 
85 90 95 

gtg gtg aag aac acc ctt aac ccc acc tgg gac cag acg ctc atc ttc 756 
Val Val Lys Asn Thr Leu Asn Pro Thr Trp Asp Gln Thr Leu Ile Phe 
100 105 110 

tac gag atc gag atc ttt ggc gag ccg gcc aca gtt gct gag caa ccg 804 
Tyr Glu Ile Glu Ile Phe Gly Glu Pro Ala Thr Val Ala Glu Gln Pro 
115 120 125 

ccc agc att gtg gtg gag ctg tac gac cat gac act tat ggt gca gac 852 
Pro Ser Ile Val Val Glu Leu Tyr Asp His Asp Thr Tyr Gly Ala Asp 
130 135 140 145 

gag ttt atg ggt cgc tgc atc tgt caa ccg agt ctg gaa cgg atg cca 900 
Glu Phe Met Gly Arg Cys Ile Cys Gln Pro Ser Leu Glu Arg Met Pro 
150 155 160 

cgg ctg gcc tgg ttc cca ctg acg agg ggc agc cag ccg tcg ggg gag 948 
Arg Leu Ala Trp Phe Pro Leu Thr Arg Gly Ser Gln Pro Ser Gly Glu 
165 170 175 

ctg ctg gcc tct ttt gag ctc atc cag aga gag aag ccg gcc atc cac 996 
Leu Leu Ala Ser Phe Glu Leu Ile Gln Arg Glu Lys Pro Ala Ile His 
180 185 190 

cat att cct ggt ttt gag gtg cag gag aca tca agg atc ctg gat gag 1044 
His Ile Pro Gly Phe Glu Val Gln Glu Thr Ser Arg Ile Leu Asp Glu 
195 200 205 

tct gag gac aca gac ctg ccc tac cca cca ccc cag agg gag gcc aac 1092 
Ser Glu Asp Thr Asp Leu Pro Tyr Pro Pro Pro Gln Arg Glu Ala Asn 
210 215 220 225 

atc tac atg gtt cct cag aac atc aag cca gcg ctc cag cgt acc gcc 1140 
Ile Tyr Met Val Pro Gln Asn Ile Lys Pro Ala Leu Gln Arg Thr Ala 
230 235 240 

atc gag atc ctg gca tgg ggc ctg cgg aac atg aag agt tac cag ctg 1188 
Ile Glu Ile Leu Ala Trp Gly Leu Arg Asn Met Lys Ser Tyr Gln Leu 
245 250 255 

gcc aac atc tcc tcc ccc agc ctc gtg gta gag tgt ggg ggc cag acg 1236 
Ala Asn Ile Ser Ser Pro Ser Leu Val Val Glu Cys Gly Gly Gln Thr 
260 265 270 

gtg cag tcc tgt gtc atc agg aac ctc cgg aag aac ccc aac ttt gac 1284 
Val Gln Ser Cys Val Ile Arg Asn Leu Arg Lys Asn Pro Asn Phe Asp 
275 280 285 

atc tgc acc ctc ttc atg gaa gtg atg ctg ccc agg gag gag ctc tac 1332 
Ile Cys Thr Leu Phe Met Glu Val Met Leu Pro Arg Glu Glu Leu Tyr 
290 295 300 305 

tgc ccc ccc atc acc gtc aag gtc atc gat aac cgc cag ttt ggc cgc 1380 
Cys Pro Pro Ile Thr Val Lys Val Ile Asp Asn Arg Gln Phe Gly Arg 
310 315 320 

cgg cct gtg gtg ggc cag tgt acc atc cgc tcc ctg gag agc ttc ctg 1428 
Arg Pro Val Val Gly Gln Cys Thr Ile Arg Ser Leu Glu Ser Phe Leu 
325 330 335 

tgt gac ccc tac tcg gcg gag agt cca tcc cca cag ggt ggc cca gac 1476 
Cys Asp Pro Tyr Ser Ala Glu Ser Pro Ser Pro Gln Gly Gly Pro Asp 
340 345 350 

gat gtg agc cta ctc agt cct ggg gaa gac gtg ctc atc gac att gat 1524 
Asp Val Ser Leu Leu Ser Pro Gly Glu Asp Val Leu Ile Asp Ile Asp 
355 360 365 

gac aag gag ccc ctc atc ccc atc cag gag gaa gag ttc atc gat tgg 1572 
Asp Lys Glu Pro Leu Ile Pro Ile Gln Glu Glu Glu Phe Ile Asp Trp 
370 375 380 385 

tgg agc aaa ttc ttt gcc tcc ata ggg gag agg gaa aag tgc ggc tcc 1620 
Trp Ser Lys Phe Phe Ala Ser Ile Gly Glu Arg Glu Lys Cys Gly Ser 
390 395 400 

tac ctg gag aag gat ttt gac acc ctg aag gtc tat gac aca cag ctg 1668 
Tyr Leu Glu Lys Asp Phe Asp Thr Leu Lys Val Tyr Asp Thr Gln Leu 
405 410 415 

gag aat gtg gag gcc ttt gag ggc ctg tct gac ttt tgt aac acc ttc 1716 
Glu Asn Val Glu Ala Phe Glu Gly Leu Ser Asp Phe Cys Asn Thr Phe 
420 425 430 

aag ctg tac cgg ggc aag acg cag gag gag aca gaa gat cca tct gtg 1764 
Lys Leu Tyr Arg Gly Lys Thr Gln Glu Glu Thr Glu Asp Pro Ser Val 
435 440 445 

att ggt gaa ttt aag ggc ctc ttc aaa att tat ccc ctc cca gaa gac 1812 
Ile Gly Glu Phe Lys Gly Leu Phe Lys Ile Tyr Pro Leu Pro Glu Asp 
450 455 460 465 

cca gcc atc ccc atg ccc cca aga cag ttc cac cag ctg gcc gcc cag 1860 
Pro Ala Ile Pro Met Pro Pro Arg Gln Phe His Gln Leu Ala Ala Gln 
470 475 480 

gga ccc cag gag tgc ttg gtc cgt atc tac att gtc cga gca ttt ggc 1908 
Gly Pro Gln Glu Cys Leu Val Arg Ile Tyr Ile Val Arg Ala Phe Gly 
485 490 495 

ctg cag ccc aag gac ccc aat gga aag tgt gat cct tac atc aag atc 1956 
Leu Gln Pro Lys Asp Pro Asn Gly Lys Cys Asp Pro Tyr Ile Lys Ile 
500 505 510 

tcc ata ggg aag aaa tca gtg agt gac cag gat aac tac atc ccc tgc 2004 
Ser Ile Gly Lys Lys Ser Val Ser Asp Gln Asp Asn Tyr Ile Pro Cys 
515 520 525 

acg ctg gag ccc gta ttt gga aag atg ttc gag ctg acc tgc act ctg 2052 
Thr Leu Glu Pro Val Phe Gly Lys Met Phe Glu Leu Thr Cys Thr Leu 
530 535 540 545 

cct ctg gag aag gac cta aag atc act ctc tat gac tat gac ctc ctc 2100 
Pro Leu Glu Lys Asp Leu Lys Ile Thr Leu Tyr Asp Tyr Asp Leu Leu 
550 555 560 

tcc aag gac gaa aag atc ggt gag acg gtc gtc gac ctg gag aac agg 2148 
Ser Lys Asp Glu Lys Ile Gly Glu Thr Val Val Asp Leu Glu Asn Arg 
565 570 575 

ctg ctg tcc aag ttt ggg gct cgc tgt gga ctc cca cag acc tac tgt 2196 
Leu Leu Ser Lys Phe Gly Ala Arg Cys Gly Leu Pro Gln Thr Tyr Cys 
580 585 590 

gtc tct gga ccg aac cag tgg cgg gac cag ctc cgc ccc tcc cag ctc 2244 
Val Ser Gly Pro Asn Gln Trp Arg Asp Gln Leu Arg Pro Ser Gln Leu 
595 600 605 

ctc cac ctc ttc tgc cag cag cat aga gtc aag gca cct gtg tac cgg 2292 
Leu His Leu Phe Cys Gln Gln His Arg Val Lys Ala Pro Val Tyr Arg 
610 615 620 625 

aca gac cgt gta atg ttt cag gat aaa gaa tat tcc att gaa gag ata 2340 
Thr Asp Arg Val Met Phe Gln Asp Lys Glu Tyr Ser Ile Glu Glu Ile 
630 635 640 

gag gct ggc agg atc cca aac cca cac ctg ggc cca gtg gag gag cgt 2388 
Glu Ala Gly Arg Ile Pro Asn Pro His Leu Gly Pro Val Glu Glu Arg 
645 650 655 

ctg gct ctg cat gtg ctt cag cag cag ggc ctg gtc ccg gag cac gtg 2436 
Leu Ala Leu His Val Leu Gln Gln Gln Gly Leu Val Pro Glu His Val 
660 665 670 

gag tca cgg ccc ctc tac agc ccc ctg cag cca gac atc gag cag ggg 2484 
Glu Ser Arg Pro Leu Tyr Ser Pro Leu Gln Pro Asp Ile Glu Gln Gly 
675 680 685 

aag ctg cag atg tgg gtc gac cta ttt ccg aag gcc ctg ggg cgg cct 2532 
Lys Leu Gln Met Trp Val Asp Leu Phe Pro Lys Ala Leu Gly Arg Pro 
690 695 700 705 

gga cct ccc ttc aac atc acc cca cgg aga gcc aga agg ttt ttc ctg 2580 
Gly Pro Pro Phe Asn Ile Thr Pro Arg Arg Ala Arg Arg Phe Phe Leu 
710 715 720 

cgt tgt att atc tgg aat acc aga gat gtg atc ctg gat gac ctg agc 2628 
Arg Cys Ile Ile Trp Asn Thr Arg Asp Val Ile Leu Asp Asp Leu Ser 
725 730 735 

ctc acg ggg gag aag atg agc gac att tat gtg aaa ggt tgg atg att 2676 
Leu Thr Gly Glu Lys Met Ser Asp Ile Tyr Val Lys Gly Trp Met Ile 
740 745 750 

ggc ttt gaa gaa cac aag caa aag aca gac gtg cat tat cgt tcc ctg 2724 
Gly Phe Glu Glu His Lys Gln Lys Thr Asp Val His Tyr Arg Ser Leu 
755 760 765 

gga ggt gaa ggc aac ttc aac tgg agg ttc att ttc ccc ttc gac tac 2772 
Gly Gly Glu Gly Asn Phe Asn Trp Arg Phe Ile Phe Pro Phe Asp Tyr 
770 775 780 785 

ctg cca gct gag caa gtc tgt acc att gcc aag aag gat gcc ttc tgg 2820 
Leu Pro Ala Glu Gln Val Cys Thr Ile Ala Lys Lys Asp Ala Phe Trp 
790 795 800 

agg ctg gac aag act gag agc aaa atc cca gca cga gtg gtg ttc cag 2868 
Arg Leu Asp Lys Thr Glu Ser Lys Ile Pro Ala Arg Val Val Phe Gln 
805 810 815 

atc tgg gac aat gac aag ttc tcc ttt gat gat ttt ctg ggc tcc ctg 2916 
Ile Trp Asp Asn Asp Lys Phe Ser Phe Asp Asp Phe Leu Gly Ser Leu 
820 825 830 

cag ctc gat ctc aac cgc atg ccc aag cca gcc aag aca gcc aag aag 2964 
Gln Leu Asp Leu Asn Arg Met Pro Lys Pro Ala Lys Thr Ala Lys Lys 
835 840 845 

tgc tcc ttg gac cag ctg gat gat gct ttc cac cca gaa tgg ttt gtg 3012 
Cys Ser Leu Asp Gln Leu Asp Asp Ala Phe His Pro Glu Trp Phe Val 
850 855 860 865 

tcc ctt ttt gag cag aaa aca gtg aag ggc tgg tgg ccc tgt gta gca 3060 
Ser Leu Phe Glu Gln Lys Thr Val Lys Gly Trp Trp Pro Cys Val Ala 
870 875 880 

gaa gag ggt gag aag aaa ata ctg gcg ggc aag ctg gaa atg acc ttg 3108 
Glu Glu Gly Glu Lys Lys Ile Leu Ala Gly Lys Leu Glu Met Thr Leu 
885 890 895 

gag att gta gca gag agt gag cat gag gag cgg cct gct ggc cag ggc 3156 
Glu Ile Val Ala Glu Ser Glu His Glu Glu Arg Pro Ala Gly Gln Gly 
900 905 910 

cgg gat gag ccc aac atg aac cct aag ctt gag gac cca agg cgc ccc 3204 
Arg Asp Glu Pro Asn Met Asn Pro Lys Leu Glu Asp Pro Arg Arg Pro 
915 920 925 

gac acc tcc ttc ctg tgg ttt acc tcc cca tac aag acc atg aag ttc 3252 
Asp Thr Ser Phe Leu Trp Phe Thr Ser Pro Tyr Lys Thr Met Lys Phe 
930 935 940 945 

atc ctg tgg cgg cgt ttc cgg tgg gcc atc atc ctc ttc atc atc ctc 3300 
Ile Leu Trp Arg Arg Phe Arg Trp Ala Ile Ile Leu Phe Ile Ile Leu 
950 955 960 

ttc atc ctg ctg ctg ttc ctg gcc atc ttc atc tac gcc ttc ccg aac 3348 
Phe Ile Leu Leu Leu Phe Leu Ala Ile Phe Ile Tyr Ala Phe Pro Asn 
965 970 975 

tat gct gcc atg aag ctg gtg aag ccc ttc agc tgaggactct cctgccctgt 3401 
Tyr Ala Ala Met Lys Leu Val Lys Pro Phe Ser 
980 985 

agaaggggcc gtggggtccc ctccagcatg ggactggcct gcctcctccg cccagctcgg 3461 

cgagctcctc cagacctcct aggcctgatt gtcctgccag ggtgggcaga cagacagatg 3521 

gaccggccca cactcccaga gttgctaaca tggagctctg agatcacccc acttccatca 3581 

tttccttctc ccccaaccca acgctttttt ggatcagctc agacatattt cagtataaaa 3641 

cagttggaac cacaaaaaaa aaaaaaaaaa 3671 

<210> 233 
<211> 988 
<212> PRT 

<213> Homo sapiens 
<400> 233 

Met Leu Gly Arg Leu Ser Leu Glu Lys Gly Arg Phe Val Asn Pro Gly 
1 5 10 15 

Gly Arg Gly Arg Asp Pro Gly Glu Gly Gly Val Met Asp Asp Lys Ser 
20 25 30 

Glu Asp Ser Met Ser Val Ser Thr Leu Ser Phe Gly Val Asn Arg Pro 
35 40 45 

Thr Ile Ser Cys Ile Phe Asp Tyr Gly Asn Arg Tyr His Leu Arg Cys 
50 55 60 

Tyr Met Tyr Gln Ala Arg Asp Leu Ala Ala Met Asp Lys Asp Ser Phe 
65 70 75 80 

Ser Asp Pro Tyr Ala Ile Val Ser Phe Leu His Gln Ser Gln Lys Thr 
85 90 95 

Val Val Val Lys Asn Thr Leu Asn Pro Thr Trp Asp Gln Thr Leu Ile 
100 105 110 

Phe Tyr Glu Ile Glu Ile Phe Gly Glu Pro Ala Thr Val Ala Glu Gln 
115 120 125 

Pro Pro Ser Ile Val Val Glu Leu Tyr Asp His Asp Thr Tyr Gly Ala 
130 135 140 

Asp Glu Phe Met Gly Arg Cys Ile Cys Gln Pro Ser Leu Glu Arg Met 
145 150 155 160 

Pro Arg Leu Ala Trp Phe Pro Leu Thr Arg Gly Ser Gln Pro Ser Gly 
165 170 175 

Glu Leu Leu Ala Ser Phe Glu Leu Ile Gln Arg Glu Lys Pro Ala Ile 
180 185 190 

His His Ile Pro Gly Phe Glu Val Gln Glu Thr Ser Arg Ile Leu Asp 
195 200 205 

Glu Ser Glu Asp Thr Asp Leu Pro Tyr Pro Pro Pro Gln Arg Glu Ala 
210 215 220 

Asn Ile Tyr Met Val Pro Gln Asn Ile Lys Pro Ala Leu Gln Arg Thr 
225 230 235 240 

Ala Ile Glu Ile Leu Ala Trp Gly Leu Arg Asn Met Lys Ser Tyr Gln 
245 250 255 

Leu Ala Asn Ile Ser Ser Pro Ser Leu Val Val Glu Cys Gly Gly Gln 
260 265 270 

Thr Val Gln Ser Cys Val Ile Arg Asn Leu Arg Lys Asn Pro Asn Phe 
275 280 285 

Asp Ile Cys Thr Leu Phe Met Glu Val Met Leu Pro Arg Glu Glu Leu 
290 295 300 

Tyr Cys Pro Pro Ile Thr Val Lys Val Ile Asp Asn Arg Gln Phe Gly 
305 310 315 320 

Arg Arg Pro Val Val Gly Gln Cys Thr Ile Arg Ser Leu Glu Ser Phe 
325 330 335 

Leu Cys Asp Pro Tyr Ser Ala Glu Ser Pro Ser Pro Gln Gly Gly Pro 
340 345 350 

Asp Asp Val Ser Leu Leu Ser Pro Gly Glu Asp Val Leu Ile Asp Ile 
355 360 365 

Asp Asp Lys Glu Pro Leu Ile Pro Ile Gln Glu Glu Glu Phe Ile Asp 
370 375 380 

Trp Trp Ser Lys Phe Phe Ala Ser Ile Gly Glu Arg Glu Lys Cys Gly 
385 390 395 400 

Ser Tyr Leu Glu Lys Asp Phe Asp Thr Leu Lys Val Tyr Asp Thr Gln 
405 410 415 

Leu Glu Asn Val Glu Ala Phe Glu Gly Leu Ser Asp Phe Cys Asn Thr 
420 425 430 

Phe Lys Leu Tyr Arg Gly Lys Thr Gln Glu Glu Thr Glu Asp Pro Ser 
435 440 445 

Val Ile Gly Glu Phe Lys Gly Leu Phe Lys Ile Tyr Pro Leu Pro Glu 
450 455 460 

Asp Pro Ala Ile Pro Met Pro Pro Arg Gln Phe His Gln Leu Ala Ala 
465 470 475 480 

Gln Gly Pro Gln Glu Cys Leu Val Arg Ile Tyr Ile Val Arg Ala Phe 
485 490 495 

Gly Leu Gln Pro Lys Asp Pro Asn Gly Lys Cys Asp Pro Tyr Ile Lys 
500 505 510 

Ile Ser Ile Gly Lys Lys Ser Val Ser Asp Gln Asp Asn Tyr Ile Pro 
515 520 525 
Cys Thr Leu Glu Pro Val 
530 

Leu Pro Leu Glu Lys Asp 
545 550 
Leu Ser Lys Asp Glu Lys 
565 

Arg Leu Leu Ser Lys Phe 
580 

Cys Val Ser Gly Pro Asn 
595 

Leu Leu His Leu Phe Cys 
610 

Arg Thr Asp Arg Val Met 
625 630 
Ile Glu Ala Gly Arg Ile 
645 

Arg Leu Ala Leu His Val 
660 

Val Glu Ser Arg Pro Leu 
675 

Gly Lys Leu Gln Met Trp 
690 

Pro Gly Pro Pro Phe Asn 
705 710 
Leu Arg Cys Ile Ile Trp 
725 

Ser Leu Thr Gly Glu Lys 
740 

Ile Gly Phe Glu Glu His 
755 

Leu Gly Gly Glu Gly Asn 
770 

Tyr Leu Pro Ala Glu Gln 
785 790 
Trp Arg Leu Asp Lys Thr 
805 

Gln Ile Trp Asp Asn Asp 
820 

Leu Gln Leu Asp Leu Asn 
835 

Lys Cys Ser Leu Asp Gln 
850 

Val Ser Leu Phe Glu Gln 
865 870 
Ala Glu Glu Gly Glu Lys 
885 

Leu Glu Ile Val Ala Glu 
900 

Gly Arg Asp Glu Pro Asn 
915 

Pro Asp Thr Ser Phe Leu 
930 

Phe Ile Leu Trp Arg Arg 
945 950 
Leu Phe Ile Leu Leu Leu 
965 

Asn Tyr Ala Ala Met Lys 
980 
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Phe 


Gly Lys 


Met 


Phe 


Glu 


535 










540 


Leu 


Lys 


He 


Thr 


Leu 
555 


Tyr 


He Gly Glu 


Thr 


Val 


Val 








570 






Gly Ala Arg 


Cys 


Gly 


Leu 






585 








Gin 


Trp Arg 


Asp 


Gin 


Leu 




600 










Gin 


Gin 


His 


Arg 


Val 


Lys 


615 










620 


Phe 


Gin 


Asp 


Lys 


Glu 
635 


Tyr 


Pro 


Asn 


Pro 


His 
650 


Leu 


Gly 


Leu 


Gin 


Gin 


Gin 


Gly 


Leu 






665 






Tyr 


Ser 


Pro 


Leu 


Gin 


Pro 


680 










Val 


Asp 


Leu 


Phe 


Pro 


Lys 


695 








700 


He 


Thr 


Pro 


Arg 


Arg 
715 


Ala 


Asn 


Thr 


Arg 


Asp 


Val 


He 






730 






Met 


Ser 


Asp 
745 


He 


Tyr 


Val 


Lys 


Gin 
760 


Lys 


Thr 


Asp 


Val 


Phe 


Asn 


Trp 


Arg 


Phe 


He 


775 










780 


Val 


Cys 


Thr 


He 


Ala 


Lys 








795 




Glu 


Ser 


Lys 


He 


Pro 


Ala 






810 






Lys 


Phe 


Ser 


Phe 


Asp 


Asp 




825 








Arq 


Met 
840 


Pro 


Lys 


Pro 


Ala 


Leu 


Asp 


Asp 


Ala 


Phe 


His 


855 










860 


Lvs 


Thr 


Val 


Lys 


Gly 


Trp 








875 




Lys 


He 


Leu 


Ala 


Gly 


Lys 






890 






Ser 


Glu 


His 
905 


Glu 


Glu 


Arg 


Met 


Asn 
920 


Pro 


Lys 


Leu 


Glu 


Trp 


Phe 


Thr 


Ser 


Pro 


Tyr 


935 










940 


Phe 


Arg 


Trp 


Ala 


He 
955 


He 


Phe 


Leu 


Ala 


He 
970 


Phe 


He 


Leu 


Val 


Lys 
985 


Pro 


Phe 


Ser 
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Leu 


Thr 


Cys 


Thr 


Asp 


Tyr 


Asp 


Leu 








560 


Asp 


Leu 


Glu 


Asn 




575 




Pro 


Gin 


Thr 


Tyr 




590 




Arg 


Pro 


Ser 


Gin 


605 








Ala 


Pro 


Val 


Tyr 


Ser 


He 


Glu 


Glu 










Pro 


Val 


Glu 


Glu 






655 




Val 


Pro 


Glu 


His 




670 






Asp 


He 


Glu 


Gin 


685 








Ala 


Leu 


Gly 


Arg 


Arg 


Arg 


Phe 


Phe 








720 


Leu 


Asp 


Asp 


Leu 






735 




Lys 


Gly 


Trp 


Met 




750 






His 


Tyr 


Arg 


Ser 


765 








Phe 


Pro 


Phe 


Asp 


Lys 


Asp 


Ala 


Phe 








OUU 


Arg 


Val 


Val 


Phe 




815 




Phe 


Leu 


Gly 


Ser 




830 






Lys 


Thr 


Ala 


Lys 


845 








Pro 


Glu 


Trp 


Phe 


Trp 


Pro 


Cys 


Val 










Leu 


Glu 


Met 


Thr 






895 




Pro 


Ala 


Gly 


Gin 




910 






Asp 


Pro 


Arg 


Arg 


925 








Lys 


Thr 


Met 


Lys 


Leu 


Phe 


He 


He 








960 


Tyr 


Ala 


Phe 


Pro 



975 
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